
About the Xbone's ESRAM size ..

tinfoilhatman

all of my posts are my avatar
My understanding is the 128MB memory in Haswell is a system-managed general cache, whereas the 32MB in XB1 is a user-managed scratchpad.

If this is the case, the implication of this is subtle but important. For example, if you wanted to use the 32MB as a temporary location for render target data (pixel storage, effectively) then you are going to potentially run out of space - just like the 10MB eDRAM buffer in the 360 limited resolution.

Hypothetical examples for a 1920x1080 frame buffer:

forward rendered FP16 HDR with 2x MSAA:
(8b colour + 4b depth/stencil) x 2 x 1920 x 1080 = 47MB

typical (eg frostbite) deferred renderer g-buffer, 4 MRTs each at 32bpp (no MSAA):
(4b mrt x 4 + 4b depth/stencil) x 1920 x 1080 = 39MB
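A quick back-of-envelope check of those two figures (plain Python; the buffer layouts are just the hypotheticals above, not any particular engine's actual setup):

Code:
# Hypothetical render target footprints at 1920x1080 (all sizes in bytes).
W, H, MB = 1920, 1080, 1024 * 1024
# forward FP16 HDR with 2x MSAA: (8 B colour + 4 B depth/stencil) per sample, 2 samples per pixel
print((8 + 4) * 2 * W * H / MB)    # ~47.5 MB
# deferred g-buffer: 4 MRTs at 4 B each + 4 B depth/stencil, no MSAA
print((4 * 4 + 4) * W * H / MB)    # ~39.6 MB
# both comfortably exceed a 32 MB scratchpad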


This doesn't necessarily mean these cases are impossible - you could render the scene in tiles or leave some buffers in DDR - but it does add a significant layer of complexity (it won't 'just work' efficiently and automatically like the Haswell cache).

The other concern I have is that it doesn't mitigate the need to copy data in/out of ESRAM - which still will be limited by DDR bandwidth. So using ESRAM will only make sense in cases where you are reading/writing the memory a large number of times within the frame - *and* those reads are often missing the on-chip caches (which in a well designed renderer isn't as common as you'd think).
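To put rough numbers on that copy cost (a sketch only; 68 GB/s is the quoted DDR3 peak, and real sustained bandwidth will be lower):

Code:
# Cost of staging a 32 MB render target into and back out of ESRAM over the DDR3 bus.
ddr3_gbps = 68.0                              # quoted peak, shared with everything else in the system
buf_gb = 32 / 1024                            # 32 MB buffer expressed in GB
copy_ms = 2 * buf_gb / ddr3_gbps * 1000       # copy in + copy out
frame_ms = 1000 / 60                          # 60 fps frame budget
print(copy_ms, copy_ms / frame_ms)            # ~0.9 ms, roughly 5-6% of the frame
# That overhead only pays off if the buffer is re-read/re-written enough times in
# ESRAM to save more bandwidth than the staging copies consume.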


Does any of this take into account tiling, deferred rendering, or virtual texturing?

This seems to be a MAJOR factor with the X1 hardware design and custom stripped down X1 DirectX version.
 

Chobel

Member
What? You don't believe em or something El Chobel?

Me asking that question doesn't mean I don't believe them. I want an explanation, and this thread looks like the best place to ask those questions.

There will continue to be a lot of skepticism about the bandwidth numbers because no one seems to understand how MS comes to them, and MS's response never explains how they come to them. They just say, "That's what they are," and don't engage further with it, even when there's push back. Seems baffling.
And of course,^This^.
 

wsippel

Banned
The intended use(s) for the SRAM becomes clear when you consider why they didn't choose eDRAM instead.
That makes no sense. SRAM and pseudo-static eDRAM (what Nintendo uses) behave completely identically. Microsoft has to use SRAM because neither TSMC nor GlobalFoundries has 28nm eDRAM tech.
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
Does any of this take into account tiling, deferred rendering, or virtual texturing?

The g-buffer he mentioned is specifically a consequence of deferred rendering. PRT has no influence on these calculations.

That makes no sense. SRAM and pseudo-static eDRAM (what Nintendo uses) behave completely identically. Microsoft has to use SRAM because neither TSMC nor GlobalFoundries has 28nm eDRAM tech.

The implications of that fact can be seen in Nintendo's case:

"Factory responsible for Wii U eDram to close within 2-3 years"
http://www.neogaf.com/forum/showthread.php?t=643667
 

RSTEIN

Comics, serious business!
My understanding is that the z-buffer acts as a synthetic GPU phased to the APU geoshell. The pseudo-static cache can fetch multitextual buffers on the fly, easily allowing total bandwidth to hit 204 GB/s number.
 
This phrase is... so full of contradictions. "...is how Microsoft arrives at 200 GB/s, but in reality it's not how any of this works."
O_O
So the Xbone arrives at its 200 GB/s because... it's a lie?

That statement is plain wrong. The 200+ GB/s figure was never counting the coherent bandwidth...

Not that it isn't kinda fishy, though, because that figure assumes some simultaneous read/write on the ESRAM, and MS hasn't explained yet how this is actually possible...
 

tfur

Member
So, to be clear about the bandwidth, and to use terminology that is used in other areas of hardware design (using rounded numbers here):

Is the ESRAM peak interconnect ~200 GB/s on read OR write?
If not, the normal way of describing peak bandwidth would be: ~200 GB/s (~100 GB/s per direction).

Maybe it is ~200 GB/s one way. If so, it should be an easy thing for the "technical fellow" to answer.
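For reference, here's a sketch of what the two readings would imply, using the widely reported ESRAM interface figures (1024 bits per direction at 853 MHz; treat those as assumptions, since MS hasn't spelled them out):

Code:
# Peak ESRAM bandwidth under the two possible readings of "~200 GB/s".
bus_bits, clock_hz = 1024, 853e6          # reported interface width and GPU clock (assumed)
one_way = bus_bits / 8 * clock_hz / 1e9   # GB/s in a single direction
print(one_way)                            # ~109 GB/s per direction
print(2 * one_way)                        # ~218 GB/s if reads and writes overlap every cycle
# "~200 GB/s aggregate (about 100 GB/s per direction)" fits the reported width;
# "~200 GB/s one way" would not.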
 

DC R1D3R

Banned
Me asking that question doesn't mean I don't believe them. I want an explanation, and this thread looks like the best place to ask those questions.


And of course,^This^.

Well I don't know about you guys but I believe them. I may not be all technically astute as the rest of yous but......it all adds up to me Chobs.
 
Well I don't know about you guys but I believe them. I may not be all technically astute as the rest of yous but......it all adds up to me Chobs.

I'm sure I would have taken MS's words at face value too except for the cavalcade of tech outlets and GAF users who have all continued to scratch their heads over the numbers. If MS had a simple explanation, I would think they would have provided it to quell those voices, but as it stands, it's MS arguing that they know better than everyone else so they should just be trusted. That just doesn't sit right with me.
 

astraycat

Member
My understanding is that the z-buffer acts as a synthetic GPU phased to the APU geoshell. The pseudo-static cache can fetch multitextual buffers on the fly, easily allowing total bandwidth to hit 204 GB/s number.
Wow! Can you also explain to us how time is a cube?
 

DC R1D3R

Banned
I'm sure I would have taken MS's words at face value too except for the cavalcade of tech outlets and GAF users who have all continued to scratch their heads over the numbers. If MS had a simple explanation, I would think they would have provided it to quell those voices, but as it stands, it's MS arguing that they know better than everyone else so they should just be trusted. That just doesn't sit right with me.

artist said:
7 years ago I believed Larry - http://i.imgur.com/EiyQVAi.gif

Not again.

PMSL
Ok you guys are cracking me up man!

:]
 

TheSoviet

Neo Member
I'm still looking forward to Albert Penello getting back to us with his "technical fellow" answers.

Not just from what he said the other night there but to actually compare the two consoles to get a proper comparison done.
 
PMSL
Ok you guys are cracking me up man!

:]

It's so cute watching your replies. People prove to you that microsoft did in fact spin hardware numbers with the 360 and the only thing you can come up with (well, that's like 99% of your posts anyway) is some drivel worthy of being on gamefaqs, like "PSML".
Yikes.
 

tipoo

Banned
This is why Intel went with 128MB for their cache even though 32MB would be fine for current workloads on Iris Pro 5200.
 
PMSL
Ok you guys are cracking me up man!

:]

Believe what you want to believe; it's clear no one could convince you to question what MS says. But there's lots of skepticism. Ars Technica just the other day posted a commentary on Penello's bandwidth claims:

Penello: We have more memory bandwidth. 176gb/sec is peak on paper for GDDR5. Our peak on paper is 272gb/sec. (68gb/sec DDR3 + 204gb/sec on ESRAM). ESRAM can do read/write cycles simultaneously so I see this number mis-quoted.

Ars Technica: Just adding up bandwidth numbers is idiotic and meaningless. While the Xbox One's ESRAM is a little faster, we don't know how it's used, and the PS4's GDDR5 is obviously a lot bigger.

Plenty more skepticism about the other numbers in his full statements can be found in the article.

http://arstechnica.com/gaming/2013/09/microsoft-exec-defends-xbox-one-from-accusations-its-underpowered/
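For what it's worth, the arithmetic behind the quoted peaks is simple (a sketch assuming the commonly cited bus widths and transfer rates; the 204 GB/s ESRAM figure is just taken at face value here, which is exactly what Ars objects to):

Code:
# Peak-on-paper arithmetic behind Penello's numbers.
ddr3  = 256 / 8 * 2.133    # 256-bit DDR3-2133          -> ~68 GB/s
gddr5 = 256 / 8 * 5.5      # 256-bit GDDR5 at 5.5 Gbps  -> ~176 GB/s (PS4)
esram = 204.0              # MS's simultaneous read+write claim, unverified
print(round(ddr3), round(gddr5), round(ddr3 + esram))   # 68, 176, 272
# The 272 figure is literally 68 + 204; it says nothing about whether both pools
# can be fed usefully at the same time, which is the substance of the objection.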
 
Maybe we can ask a mod to mod the thread?

I think you should, but how can we make that thread not go to shit?

Maybe it will stimulate the rising of the "technical fellow."

I may have just renamed my penis

I messaged Bish to ask thoughts. If I don't hear anything back, I'll probably just go ahead with it and watch as hell descends.

It's basically a summary of what was already covered in the GAF thread that they took his quotes from, so yes, it will get locked

The addition is that it includes Ars' take on Penello's statements.
 

TRios Zen

Member
I messaged Bish to ask thoughts. If I don't hear anything back, I'll probably just go ahead with it and watch as hell descends.



The addition is that it includes Ars' take on Penello's statements.

So I'm pretty sure that this discussion has already occurred. Ars Technica saying that they don't believe Penello's statement would just rehash old arguments IMO.

Any true NEW discussion on this issue could only occur after some further explanation from MS is provided as to how they get to these bandwidth numbers, such that the validity of that could be bandied about. Just my two cents though.
 
So I'm pretty sure that this discussion has already occurred. Ars Technica saying that they don't believe Penello's statement would just rehash old arguments IMO.

Any true NEW discussion on this issue could only occur after some further explanation from MS is provided as to how they get to these bandwidth numbers, such that the validity of that could be bandied about. Just my two cents though.

That may be. The way I see it, people often don't give much weight to posters on a forum, but this is a well-known tech site that's taken a stand on the issue. It largely stands against Penello, and maybe that's the sentiment from NeoGAF too, but this is a voice from outside. I haven't heard back from a mod, so I'll just toss up the relevant Ars analysis and see what happens.

If it gets locked, it gets locked. Not going to talk about this anymore though, because it's definitely strayed and is derailing the thread. Thanks for folks' input.
 
Well I don't know about you guys but I believe them. I may not be all technically astute as the rest of yous but......it all adds up to me Chobs.

[image]
 

tensuke

Member
Not sure what you even mean by that ..

PMSL.
Basically he finds it funny that anyone would disagree with MS about their numbers, even though they don't add up with what we know, and MS won't go into more detail (yet) about how they do. If Albert can get the TF to answer some questions directly, I think these questions could be alleviated one way or another. I don't see what's wrong in at least discussing what we know to try and make sense of everything.
 
If we know anything at all about console development it's that

a) they are always outdated quicker than we hope

b) developers are always crafty about finding workarounds

You could have said the exact same thing about the PS3 having 256MB of main RAM and 256MB VRAM in 2005 compared with the 6-8GB main RAM and 2GB+ VRAM they have now. And it still managed this:

[image: The Last of Us screenshot]


I honestly don't see why this is different.

Bullshot... I have TLOU and a great HDTV and it doesn't look this good.
 

wsippel

Banned
This is about Wii U, but the memory architecture is quite similar, so I guess it makes sense to post this quote from Shin'en here:

We've gone deferred + HDR for our second generation Wii U engine. Very simple and fast on Wii U, because all render targets fit in eDRAM. Nano Assault Neo used forward rendering because we were afraid deferred would be too slow for 60fps. Fortunately, it works great on Wii U.
So, the thing is: if you're going deferred, you want a really fast memory pool. It doesn't have to be all that big as long as it's fast. Both Microsoft and Nintendo went with that approach - small, extremely fast local memory pools to store render targets (Nintendo has a bit more embedded memory, 35MB versus 32MB, but we don't know how fast it is - it also has to drive two screens). 360 had eDRAM as well, but it was too small and too limited to efficiently deal with multiple render targets.
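A rough illustration of why that works at 720p but gets tight at 1080p (the g-buffer layout here is just the generic 4-MRT example from earlier in the thread, not Shin'en's actual one):

Code:
# Illustrative deferred g-buffer footprint: 4 MRTs at 4 B/pixel plus 4 B depth/stencil.
def gbuffer_mb(w, h, mrts=4, bytes_per_target=4):
    return (mrts * bytes_per_target + 4) * w * h / (1024 * 1024)

print(gbuffer_mb(1280, 720))    # ~17.6 MB -- fits easily in a 32-35 MB pool
print(gbuffer_mb(1920, 1080))   # ~39.6 MB -- no longer fits without tiling or spilling to DDR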
 

NBtoaster

Member
This is about Wii U, but the memory architecture is quite similar, so I guess it makes sense to post this quote from Shin'en here:


So, the thing is: if you're going deferred, you want a really fast memory pool. It doesn't have to be all that big as long as it's fast. Both Microsoft and Nintendo went with that approach - small, extremely fast local memory pools to store render targets (Nintendo has a bit more embedded memory, 35MB versus 32MB, but we don't know how fast it is - it also has to drive two screens). 360 had eDRAM as well, but it was too small and too limited to efficiently deal with multiple render targets.

Ideally it needs to be big enough to handle many render targets at the typical resolution. Wii U mostly benefits from a 720p target, so 32MB should be enough, but as you can see from the calculations above, at 1080p they can easily exceed 32MB.
 

wsippel

Banned
Ideally it needs to be big enough to handle many render targets at the typical resolution. Wii U mostly benefits from a 720p target, so 32MB should be enough, but as you can see from the calculations above, at 1080p they can easily exceed 32MB.
I don't know how relevant that is, but Xenos could only handle up to four render targets to begin with, and with tons of voodoo, it was possible to do that even with just 10MB of eDRAM, for a 720p target. As far as I know, all next-gen systems can handle up to eight simultaneous render targets, and both Xbone and Wii U have more than three times the embedded memory. To me as a layman, that sounds like it should work.
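To make the "voodoo" concrete, here's a sketch of how many tiles a 4-MRT target set needs for a given amount of embedded memory (the per-pixel byte sizes are illustrative, not Xenos's exact formats):

Code:
import math

# How many tiling passes a render target set needs to fit in embedded memory.
def tiles_needed(w, h, bytes_per_pixel, pool_mb):
    return math.ceil(w * h * bytes_per_pixel / (pool_mb * 1024 * 1024))

bpp = 4 * 4 + 4                             # four 32bpp MRTs plus 32bpp depth/stencil
print(tiles_needed(1280, 720, bpp, 10))     # 2 -- a 360-style 10 MB pool needs tiling even at 720p
print(tiles_needed(1280, 720, bpp, 32))     # 1 -- a 32 MB pool holds it in one pass
print(tiles_needed(1920, 1080, bpp, 32))    # 2 -- the same layout at 1080p needs tiling again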
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
I don't know how relevant that is, but Xenos could only handle up to four render targets to begin with, and with tons of voodoo, it was possible to do that even with just 10MB of eDRAM, for a 720p target. As far as I know, all next-gen systems can handle up to eight simultaneous render targets, and both Xbone and Wii U have more than three times the embedded memory. To me as a layman, that sounds like it should work.

The problem is that the size of render targets increases with resolution, and next-gen games will preferably go for 1080p. 1920 * 1080 * 32 bit ~= 7.9MB
 