
DigitalFoundry: X1 memory performance improved for production console (ESRAM 192 GB/s)

Status
Not open for further replies.

Freki

Member
So what? I don't know how memory controllers work. Do you? What makes you think that this is impossible?

Actually - yes ^^.
I think this is impossible because every cycle is the same - suggesting something would only work 88% of the cycles doesn't make much sense...
 
88% is hardly every few cycles.

Edit: Also, wouldn't this kind of fluctuating bandwidth cause trouble for devs?

I should have been more clear. I was also going based on an interpretation that I read on beyond3d, as well as the article.

Actually - yes ^^.
I think this is impossible because every cycle is the same - suggesting something would only work 88% of the cycles doesn't make much sense...

I'm no expert myself, but it seems to be implying that there are cycles in which both a read and write are possible, meaning that not every cycle is the same all of the time.

However, with near-final production silicon, Microsoft techs have found that the hardware is capable of reading and writing simultaneously. Apparently, there are spare processing cycle "holes" that can be utilised for additional operations. Theoretical peak performance is one thing, but in real-life scenarios it's believed that 133GB/s throughput has been achieved with alpha transparency blending operations (FP16 x4).

Or even if there is a bit of confusion on my part, and all it's really stating is that ESRAM, under certain scenarios, can demonstrate greater memory performance similar to the EDRAM on the Xbox 360, then we are still left with the same end conclusion: ESRAM memory bandwidth performance is a bit better than originally anticipated. I also just realized that in the official Durango documentation that I'm looking at, the same alpha transparency blending example that's mentioned in the Eurogamer article was pointed out specifically by Microsoft as something that was possible thanks to the 32MB of ESRAM, so this doesn't entirely appear to be coming out of left field. I'm guessing this was originally an MS claim that, thanks to more final looking hardware, devs are now finding to have some real merit?
 

Myshoe

Banned
Ignoring the peak theoretical performance, isn't the most important part of the article this little bit? (if true)

Theoretical peak performance is one thing, but in real-life scenarios it's believed that 133GB/s throughput has been achieved with alpha transparency blending operations (FP16 x4).

Isn't that confirmation of best case real world performance? Won't it be lower in normal cases given that 133GB/s comes from performing a specific operation?
 
Really?



On the whole, the article reads as though every few cycles there are instances in which bandwidth beyond previously believed performance levels is achievable. With very low latency this sounds like a pretty convenient situation to have, since the lower the latency, the sooner the cycles that deliver that extra memory bandwidth come around again.



DF is regularly quoted here as a fairly trusted and reliable source. It does seem odd that when the news sounds generally positive the pitchforks come out. Also, people shouldn't be acting as if this is somehow anything new. The Xbox 360 already did something similar to this with its EDRAM. The Xbox 360 only ever achieved that 256GB/s of memory bandwidth performance under specific conditions. In most other cases, the bandwidth was much, much lower than that.
So what could the specific condition be like now? Free 16xMSAA and AF? The rest is a constant 68GB/s, right?
 

ekim

Member
Ignoring the peak theoretical performance, isn't the most important part of the article this little bit? (if true)



Isn't that confirmation of best case real world performance? Won't it be lower in normal cases given that 133GB/s comes from performing a specific operation?

Afaik, the leaked slides from the Durango summit stated that full speed FP16x4 blend operations are possible under special (unknown) circumstances.
 
rather 4xMSAA considering the size of the ESRam.
OK, makes sense, but that's it? 4xMSAA? This is probably going to be the more straightforward approach 3rd party games are going to take. Then games will look worse because of a constant 68GB/s vs 176GB/s? That's what I'm getting out of this. I wish MS would just come out with hard facts.
 

KHarvey16

Member
This doesn't sound like bidirectional or simultaneous read/writes were ever physically prohibited but simply not implemented due to timing budget constraints. That kind of revelation this late in the process is certainly not impossible.
 

watership

Member
It's hilarious that DF has been quoted here through the years as a reliable source for hardware news. But the second that they post something that goes against the current Train of Bashing, people are dismissing it as bullshit.

In the past month since the One announcement anything can be twisted to serve as the tool of confirmation bias. Kotaku was suddenly an accurate source. Jim Sterling is becoming an internet hero. Adam Sessler is a shill for Microsoft (wtf). There is no record of credibility, there is no record of quality. Whoever is on 'my side' has credibility. This is the new paradigm for next gen.
 
Ignoring the peak theoretical performance, isn't the most important part of the article this little bit? (if true)



Isn't that confirmation of best case real world performance? Won't it be lower in normal cases given that 133GB/s comes from performing a specific operation?

It's just AN example. Others could be found where you get higher. However, the fact is you'll only approach the theoretical max if you are doing equal amounts of reads and writes at the same time, ALL the time.

So clearly it's not going to happen.
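
To make the point about the read/write mix concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not a description of the actual memory controller: it assumes reads and writes each have their own bandwidth cap, and the 96GB/s cap used in the example is simply half of the 192GB/s peak quoted in the article. The function name `toy_peak_bandwidth` and its parameters are made up for illustration.

```python
def toy_peak_bandwidth(read_fraction: float, per_direction_cap: float) -> float:
    """Total GB/s in a toy model where reads and writes are capped separately.

    read_fraction is the share of traffic that is reads (0.0 to 1.0).
    The busier direction hits its cap first and limits the total.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    busiest_share = max(read_fraction, 1.0 - read_fraction)
    return per_direction_cap / busiest_share


# Assumed cap: half of the 192GB/s peak figure from the article.
CAP_GB_S = 96.0

for mix in (0.5, 0.75, 1.0):
    print(f"{mix:.0%} reads -> {toy_peak_bandwidth(mix, CAP_GB_S):.0f} GB/s")
```

In this model only an exact 50/50 mix reaches the full figure, and anything lopsided falls back toward the single-direction number.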
 

charsace

Member
Why are people always bringing up the 800MB Killzone reserved render target memory? None of you that post about it know what they are using it for and just blindly post it because it's a big number. The render target memory is being used for more than just a 1080p frame back buffer. So unless you know something substantial about why they need that size for the render targets (e.g. the number of render targets stored, the sizes of the render targets, or their purpose), it just looks stupid when the number is randomly thrown out.
 

Harp

Member
I thought the downclock had to do with the GPU? Isn't this talking about memory bandwidth? I wouldn't get too excited about engineers finding more bandwidth in closer-to-final hardware, because until it is the actual final hardware, they could find that the next batch provides slower performance. They won't know until the production hardware is released.
 

hal9001

Banned
Reads more like creative accounting to disguise a GPU downclock from 800MHz by 50MHz.

750 MHz (freq) * 128 bytes per cycle = 96 GB/s
96 GB/s * 2 (simultaneous read/write) = 192 GB/s
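
The arithmetic in that comment only works out if the 128 is read as bytes moved per clock cycle. A minimal Python sketch of the same calculation follows; the helper name `bandwidth_gb_s` is hypothetical, and the 128 bytes per cycle is an assumption inferred from the numbers, not a confirmed spec.

```python
def bandwidth_gb_s(clock_mhz: int, bytes_per_cycle: int) -> float:
    """Peak bandwidth in GB/s (1 GB = 10**9 bytes) for one transfer per cycle."""
    # MHz * bytes = millions of bytes per second; divide by 1000 for GB/s.
    return clock_mhz * bytes_per_cycle / 1000


BYTES_PER_CYCLE = 128  # assumption inferred from the quoted arithmetic

downclocked = bandwidth_gb_s(750, BYTES_PER_CYCLE)  # the rumoured 750MHz case
original = bandwidth_gb_s(800, BYTES_PER_CYCLE)     # the 800MHz case

print(f"750MHz: {downclocked} GB/s one way, {2 * downclocked} GB/s read+write")
print(f"800MHz: {original} GB/s one way, {2 * original} GB/s read+write")
```

The 800MHz case gives the 102GB/s figure discussed elsewhere in the thread, which is why the 192GB/s number prompted the downclock theory.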

How much does MS pay you to write this garbage, Leadbetter?

Some people in the comments section of Digital Foundry take it too personally sometimes. This next Gen war is going to be brutal.
 
Afaik, the leaked slides from the Durango summit stated that full speed FP16x4 blend operations are possible under special (unknown) circumstances.

You are correct. I'm looking at it as we speak. MS specifically said 'Full speed FP16x4 writes are possible in limited circumstances'. It was one of the highlighted implications of using 32MB of generic ESRAM.
 
Why not? (honest question) It wasn't possible with DX9 afaik but DX11 should allow that.

I thought that DX11 MSAA on a deferred renderer still requires considerably more VRAM and performance cost than it would on a forward renderer. It's the reason why it's so crippling on games like Hitman on PC.

As such I would assume that applying the old rules about the sort of free MSAA you could get out of your ESRAM no longer stand when you're talking about deferred renderers.
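
A rough size estimate shows why MSAA and deferred rendering strain a 32MB pool. The Python sketch below is illustrative only: the buffer layouts (one colour target plus depth for forward, four G-buffer targets plus depth for deferred, all 32-bit) are assumptions rather than any real game's setup, "32MB" is read as 32 MiB, and it ignores tiling or splitting targets between ESRAM and main memory.

```python
ESRAM_BYTES = 32 * 1024 * 1024  # assumption: "32MB" read as 32 MiB


def render_target_bytes(width, height, bytes_per_pixel, samples=1):
    """Size of one render target; MSAA stores `samples` values per pixel."""
    return width * height * bytes_per_pixel * samples


def buffer_set_bytes(width, height, bytes_per_pixel_list, samples=1):
    """Total size of several render targets sharing one resolution."""
    return sum(
        render_target_bytes(width, height, bpp, samples)
        for bpp in bytes_per_pixel_list
    )


def fits_in_esram(total_bytes):
    return total_bytes <= ESRAM_BYTES


# Hypothetical layouts, in bytes per pixel for each target.
FORWARD = [4, 4]            # one 32-bit colour target + 32-bit depth
DEFERRED = [4, 4, 4, 4, 4]  # four 32-bit G-buffer targets + 32-bit depth

for name, layout in (("forward", FORWARD), ("deferred", DEFERRED)):
    for samples in (1, 4):
        total = buffer_set_bytes(1920, 1080, layout, samples)
        verdict = "fits" if fits_in_esram(total) else "does not fit"
        print(f"{name} {samples}x: {total / 2**20:.1f} MiB, {verdict}")
```

Under these assumed layouts, only the forward case without MSAA fits at 1080p, which is consistent with the argument that the old "free MSAA" rules do not carry over directly.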
 

Wynnebeck

Banned
It's hilarious that DF has been quoted here through the years as a reliable source for hardware news. But the second that they post something that goes against the current Train of Bashing, people are dismissing it as bullshit.

Gotta love that confirmation bias.
 

ekim

Member
I think he means that on a deferred renderer the frame buffer is going to be so big that you cannot fit it in ESRAM.

Well I'm probably wrong but I thought shaders are able to access single samples in a MS target and thus the size shouldn't be a problem. :eek:

edit: ok - I am. :(
 
Actually - yes ^^.
I think this is impossible because every cycle is the same - suggesting something would only work 88% of the cycles doesn't make much sense...

In fact, 88% efficiency would be awesome. i7 IMCs miss more than 20% of the peak theoretical performance between overhead and missed cycles. And those are the top end desktop parts, not budget chips.

I don't understand this 'news'. It was already known that ESRAM has bidirectional bandwidth.
 

Truant

Member
Great stuff if true. The more parity the better. I don't want my favorite developers ambitions being lowered because of one underpowered console.
 

Vespene

Member
Fact of the matter is most multiplatform developers build their game to run on the lesser hardware, thus not taking full advantage of the more powerful console.
 

Freki

Member
In fact, 88% efficiency would be awesome. i7 IMCs miss more than 20% of the peak theoretical performance between overhead and missed cycles. And those are the top end desktop parts, not budget chips.

I don't understand this 'news'. It was already known that ESRAM has bidirectional bandwidth.
The 88% isn't about efficiency - it's about what the peak performance should be.
DF states that 192GB/s is the peak performance for bi-directional memory access, whereas it should be 204GB/s going by the previous 102GB/s uni-directional peak performance ...

133GB/s is the believed real-life performance you were looking for...
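
For anyone wondering where the 88% comes from, it falls out of the rounded figures used in this thread. A quick Python check follows; the variable names are just for illustration.

```python
# Rounded figures as quoted in the thread, in GB/s.
uni_directional_peak = 102.0
reported_peak = 192.0
real_life_figure = 133.0

# What a clean doubling of the one-way figure would give.
doubled_peak = 2 * uni_directional_peak

# Extra bandwidth on top of the one-way peak, as a share of that peak.
extra_share = (reported_peak - uni_directional_peak) / uni_directional_peak

# Reported peak and real-life figure relative to other numbers in the thread.
reported_vs_doubled = reported_peak / doubled_peak
real_vs_reported = real_life_figure / reported_peak

print(f"doubled one-way peak: {doubled_peak:.0f} GB/s")
print(f"extra over one-way peak: {extra_share:.0%}")
print(f"reported peak vs doubled: {reported_vs_doubled:.0%}")
print(f"real-life vs reported peak: {real_vs_reported:.0%}")
```

So the 88% is the size of the extra bandwidth relative to the one-way peak, while the reported peak sits at roughly 94% of a straight doubling.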
 

ekim

Member
Fact of the matter is most multiplatform developers build their game to run on the lesser hardware, thus not taking full advantage of the more powerful console.

Especially first gen games - yeah.
But as we have a pretty similar architecture, we should expect higher frame rates/resolutions for PS4.
 
Fact of the matter is most multiplatform developers build their game to run on the lesser hardware, thus not taking full advantage of the more powerful console.
That's not true. Multiple generations of gaming show that weaker hardware gets worse quality ports.
 

charsace

Member
I thought that DX11 MSAA on a deferred renderer still requires considerably more VRAM and performance cost than it would on a forward renderer. It's the reason why it's so crippling on games like Hitman on PC.

As such I would assume that applying the old rules about the sort of free MSAA you could get out of your ESRAM no longer stand when you're talking about deferred renderers.
Since around 2008 people have been saying deferred rendering is the future. DX/OGL have taken DR into account for years. Does it make any sense to you that the company that makes DX and has a system that is basically the DirectXbox wouldn't build a system that could do deferred rendering at 1080p?
 

coldfoot

Banned
Does DF have a credibility problem?
They have a bias problem.
Only thing that you should take from their articles is that ESRAM can now do a read/write concurrently, everything else is plain Xbox puff piece, along with the fallacy of DDR3 having lower latency than GDDR5 and this magical SHAPE audio chip.

Just read the news on DF and ignore all the stupid commentary from Dick.
 

Spongebob

Banned
Fact of the matter is most multiplatform developers build their game to run on the lesser hardware, thus not taking full advantage of the more powerful console.
This isn't going to be like previous generations where the architectures of the consoles are completely different.

Taking advantage of the PS4's extra power is going to be relatively trivial.
 
So did the 360.
360 didn't have the CELL.
Not that surprising, but nice to hear. Remember that DirectX 11.2 only got released very recently and it's supported on the XB1 so it's obvious that they're iterating fairly rapidly.



The 170GB/s figure before was for combined peak eSRAM and peak DDR3 bandwidth. The new 192GB/s is for peak eSRAM only, an increase from 102GB/s.

Before
DDR3 peak: 68GB/s
eSRAM peak: 102GB/s
Combined peak: 170GB/s

Now
DDR3 peak: 68GB/s
eSRAM peak: 192GB/s
eSRAM realistic: 133GB/s
Combined peak (unrealistic): 260GB/s
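
The combined numbers are plain sums of the two pools, which a few lines of Python can confirm. This is a sketch using the rounded figures from this post, and adding them assumes both memories hit their peak at the same moment, which is why the total is labelled unrealistic. The "realistic" combined value is my own sum and is not a figure from the article.

```python
DDR3_PEAK = 68          # GB/s
ESRAM_PEAK_BEFORE = 102  # GB/s
ESRAM_PEAK_NOW = 192     # GB/s
ESRAM_REALISTIC = 133    # GB/s, the alpha blending figure

combined_before = DDR3_PEAK + ESRAM_PEAK_BEFORE
combined_now = DDR3_PEAK + ESRAM_PEAK_NOW
# Not from the article: DDR3 peak plus the realistic eSRAM figure.
combined_realistic = DDR3_PEAK + ESRAM_REALISTIC

print(f"before: {combined_before} GB/s")
print(f"now (peak): {combined_now} GB/s")
print(f"now (realistic eSRAM): {combined_realistic} GB/s")
```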
Xbone more powerful than PS4 has been confirmed. GG Sony.
Joking, don't come after me, Canadian woman.
I timestamped the Mark Cerny EDRAM explanation.

http://www.youtube.com/watch?v=JJW5OKbh0WA#t=40m

Thanks