
About the Xbone's ESRAM size ..

artist

Banned
We all know Microsoft's strategy in choosing the memory architecture for the Xbone - split into the main DDR3 and on-die ESRAM pools. This thread isn't about that. It's not about comparison with the PS4 APU either. I'm talking about the size Microsoft chose for the ESRAM - 32MB.

I'll start off with a quote from Anandtech's Xbone Arch analysis:
Anandtech said:
To make up for the gap, Microsoft added embedded SRAM on die (not eDRAM, less area efficient but lower latency and doesn't need refreshing). All information points to 32MB of 6T-SRAM, or roughly 1.6 billion transistors for this memory. It’s not immediately clear whether or not this is a true cache or software managed memory. I’d hope for the former but it’s quite possible that it isn’t. At 32MB the ESRAM is more than enough for frame buffer storage, indicating that Microsoft expects developers to use it to offload requests from the system memory bus. Game console makers (Microsoft included) have often used large high speed memories to get around memory bandwidth limitations, so this is no different. Although 32MB doesn’t sound like much, if it is indeed used as a cache (with the frame buffer kept in main memory) it’s actually enough to have a substantial hit rate in current workloads (although there’s not much room for growth).

Vgleaks has a wealth of info, likely supplied from game developers with direct access to Xbox One specs, that looks to be very accurate at this point. According to their data, there’s roughly 50GB/s of bandwidth in each direction to the SoC’s embedded SRAM (102GB/s total bandwidth). The combination of the two plus the CPU-GPU connection at 30GB/s is how Microsoft arrives at its 200GB/s bandwidth figure, although in reality that’s not how any of this works.
So Anand notes that if the ESRAM is used as a cache (which it is), it will be a significant performance benefit for current workloads, but he also notes that there is not much room for growth.

Moving on to another Anand piece, on the GT3e (Iris Pro 5200):
Anandtech said:
There’s only a single size of eDRAM offered this generation: 128MB. Since it’s a cache and not a buffer (and a giant one at that), Intel found that hit rate rarely dropped below 95%. It turns out that for current workloads, Intel didn’t see much benefit beyond a 32MB eDRAM however it wanted the design to be future proof. Intel doubled the size to deal with any increases in game complexity, and doubled it again just to be sure. I believe the exact wording Intel’s Tom Piazza used during his explanation of why 128MB was “go big or go home”. It’s very rare that we see Intel be so liberal with die area, which makes me think this 128MB design is going to stick around for a while.

The 32MB number is particularly interesting because it’s the same number Microsoft arrived at for the embedded SRAM on the Xbox One silicon. If you felt that I was hinting heavily at the Xbox One being ok if its eSRAM was indeed a cache, this is why. I’d also like to point out the difference in future proofing between the two designs.
So on one hand we have a console APU that needs to be future-proofed for at least 5 years, and Microsoft chose 32MB; on the other hand we have a famously miserly Intel, very conservative with its die sizes, and they decided to future-proof theirs with 128MB.

To me, this reads like a non-issue right now, but years down the road devs may need to play the optimization game more and more with the 32MB of ESRAM.
 

Orayn

Member
Do we know if the 32MB of eSRAM is backed up by anything that would give the potential for "free" AA like the 10MB of eDRAM on the Xbox 360, or is that dream completely dead at this point? I thought it was sad that devs seemed to give up on that so early in the 360's lifespan.
 

oVerde

Banned
Someone (here on GAF) told me once that 32MB is so low it couldn't even hold a double-buffered frame to prevent screen tearing (considering 1080p output). I don't know if that's right or not, but 32MB is a lot for caching many, many things.

I'm programming my own game but haven't stepped into the optimization area yet, so my numbers aren't that reliable.
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
It is definitely not a hardware-managed cache like the eDRAM in Haswell. It is mainly there to store render targets ("pixelbuffers") [1], just like on the 360. In addition, with the DMEs (Data Move Engines) it seems to be able to temporarily hold pre-fetched textures from main memory.

[1] said:
- Texturing from ESRAM
- Rendering to surfaces in main RAM
- Read back from render targets without performing a resolve (in certain cases)

[1] http://www.vgleaks.com/durango-gpu-2/2/
[2] http://en.wikipedia.org/wiki/Render_Target
 

bobbytkc

ADD New Gen Gamer
Do we know if the 32MB of eSRAM is backed up by anything that would give the potential for "free" AA like the 10MB of eDRAM on the Xbox 360, or is that dream completely dead at this point? I thought it was sad that devs seemed to give up on that so early in the 360's lifespan.

The 'free' AA on the 360 is just fudging of specs because the bandwidth of the eDRAM is so large that the additional bandwidth of implementing AA is negligible. However, AA does incur a considerable computational cost, which is not in any way 'free'.

If you want to follow this terminology though, AA will actually be less 'free' on the Xbone than on the 360, because the framebuffer bandwidth is lower, around 50% of the 360's eDRAM bandwidth.
 

malfcn

Member
I wish I understood more of what all this means. When I hear 32MB it sounds tiny, but people in the know get all excited about it.
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
The 'free' AA on the 360 is just fudging of specs because the bandwidth of the eDRAM is so large that the additional bandwidth of implementing AA is negligible.

Certain AA is free on the 360 because the 360's eDRAM has additional hardware logic that implements it.
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
I wish I understood more of what all this means. When I hear 32MB it sounds tiny, but people in the know get all excited about it.

In typical rendering workflows you read and write a lot from and into pixelbuffers. Those buffers have sizes that depend (1) on the target resolution and (2) on the information stored per pixel (e.g. color). Buffers may hold the actual screen content or intermediate information in two-pass renderers [1]. The size of such a buffer is the product of the number of pixels and the information per pixel, e.g. ~23.8MB at 1080p with 12 bytes of information per pixel. So a pool of 32MB is big enough for many scenarios but might be limited for some. KZ:SF, for instance, has render targets with over 40MB of total size.
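A quick back-of-the-envelope sketch of that arithmetic in Python (the G-buffer layout below is an assumption for illustration, not KZ:SF's actual setup):

def target_mib(width, height, bytes_per_pixel):
    # size of one render target = pixel count * bytes of information per pixel
    return width * height * bytes_per_pixel / (1024 * 1024)

# 1080p with 12 bytes of information per pixel, as in the example above
print(round(target_mib(1920, 1080, 12), 1))  # -> 23.7 MiB

# A hypothetical deferred G-buffer: four 4-byte targets plus a 4-byte depth buffer
layout = [4, 4, 4, 4, 4]
print(round(sum(target_mib(1920, 1080, bpp) for bpp in layout), 1))  # -> 39.6 MiB, in the same ballpark as the 40MB+ KZ:SF figure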

Nevertheless, statements like "Only 32MB, lol" are plain wrong.

[1] http://en.wikipedia.org/wiki/Deferred_shading
 

panda-zebra

Banned
Someone (here on GAF) told me once that 32MB is so low it couldn't even hold a double-buffered frame to prevent screen tearing (considering 1080p output). I don't know if that's right or not, but 32MB is a lot for caching many, many things.

I'm programming my own game but haven't stepped into the optimization area yet, so my numbers aren't that reliable.

32MB is plenty to double buffer 1920x1080 at 32-bit. The ESRAM was designed for the framebuffer, just like the eDRAM on Wii U, and can also be used as a scratchpad.
 

Orayn

Member
Someone (here on GAF) told me once that 32MB is so low it couldn't even hold a double-buffered frame to prevent screen tearing (considering 1080p output). I don't know if that's right or not, but 32MB is a lot for caching many, many things.

I'm programming my own game but haven't stepped into the optimization area yet, so my numbers aren't that reliable.

Back-of-the-envelope calculation: a 1280x720 frame with no AA = 7MB, so a 1920x1080 frame is 2.25x bigger at ~15.75MB, which means 32MB is just barely big enough to double buffer.

Edit: Based on some article about the 360 eDRAM.
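Roughly the same sums in Python (a minimal sketch, assuming the 7MB/720p figure covers a 32-bit colour buffer plus a 32-bit depth buffer, i.e. 8 bytes per pixel in total):

def frame_mib(width, height, bytes_per_pixel=8):
    # 4-byte colour + 4-byte depth per pixel (assumed layout; no AA)
    return width * height * bytes_per_pixel / (1024 * 1024)

print(round(frame_mib(1280, 720), 2))    # -> 7.03 MiB
print(round(frame_mib(1920, 1080), 2))   # -> 15.82 MiB, i.e. 2.25x the 720p figure
print(2 * frame_mib(1920, 1080) <= 32)   # -> True: a 1080p double buffer just squeezes into 32 MiB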
 

cyberheater

PS4 PS4 PS4 PS4 PS4 PS4 PS4 PS4 PS4 PS4 PS4 PS4 PS4 PS4 PS4 PS4 PS4 Xbone PS4 PS4
Apparently a lot of post-processing effects don't require much memory, so a low-latency, fast pool of memory like the ESRAM would be very beneficial for those types of processing.
 

twobear

sputum-flecked apoplexy
If we know anything at all about console development it's that

a) they are always outdated quicker than we hope

b) developers are always crafty about finding workarounds

You could have said the exact same thing about the PS3 having 256MB of main RAM and 256MB of VRAM in 2005, compared with the 6-8GB of main RAM and 2GB+ of VRAM PCs have now. And it still managed this:

[Image: The Last of Us screenshot]


I honestly don't see why this is different.
 
No more. My head hurts. Let's just play games.

If you don't like the topic, get out. This topic is about hardware; if you want to talk games, go elsewhere.

You could have said the exact same thing about the PS3 having 256MB of main RAM and 256MB of VRAM in 2005, compared with the 6-8GB of main RAM and 2GB+ of VRAM PCs have now. And it still managed this:

[Image: The Last of Us screenshot]


I honestly don't see why this is different.

It's not different. That's kind of the point. The PS3 having split 256MB main and 256MB video memory was a terrible idea in hindsight.
 
I remember reading something about the Xbox 360's eDRAM: it was only 10MB because they thought developers weren't going to use deferred renderers, and going with 12MB would have made a big difference? And that 10MB was only really good for free AA, so the Xbox One's 32MB of ESRAM, although it doesn't have as high a bandwidth as eDRAM, will be more than enough this coming gen.
 

Ricky_R

Member
I think we should let this sleep until Albert or the "Technical fellow" come here for an AMA or some clarifications. It will just get locked.
 
I think it's important to remember that the motivation for the 32MB of ESRAM is not a performance boost, but cost savings. Using it allows them to use relatively inexpensive DDR3 memory outside the APU, compared to more expensive GDDR5. There's a direct tradeoff between ESRAM and the number of CUs you can fit at a given APU size. MS went for cost savings as opposed to performance.

The reason Intel uses eDRAM is that the PC architecture sticks them with a normal DDR3 interface at 1600MHz. MS chose to stick themselves in that situation; they didn't have to. MS could have done a daughter die (like the Intel solution) to avoid one giant die.
 

twobear

sputum-flecked apoplexy
It's not different. That's kind of the point. The PS3 having split 256MB main and 256MB video memory was a terrible idea in hindsight.

It might not have been ideal, but it also gave us TLOU.

If the 32MB of ESRAM leads to stuff like Skyrim PS3 in a few years, I guess I'll revisit my opinion? Until then I don't see how this is much more than just FUD for FUD's sake.

I think it's important to remember that the motivation for the 32MB of ESRAM is not a performance boost, but cost savings. Using it allows them to use relatively inexpensive DDR3 memory outside the APU, compared to more expensive GDDR5. There's a direct tradeoff between ESRAM and the number of CUs you can fit at a given APU size. MS went for cost savings as opposed to performance.

The reason Intel uses eDRAM is that the PC architecture sticks them with a normal DDR3 interface at 1600MHz. MS chose to stick themselves in that situation; they didn't have to. MS could have done a daughter die (like the Intel solution) to avoid one giant die.
It clearly isn't about cost because Sony are paying less for the silicon in PS4. It's about wanting 8GB RAM for OS functions and not wanting to take a gamble on GDDR5 chip densities.

OK, I guess in a roundabout way it is about cost because they could have used twice the number of GDDR5 chips with half the density, but good luck selling a $600 console, MS.
 

ethomaz

Banned
The Xbone's eSRAM is not only for cache or framebuffer like the eDRAM in the 360. You can use the eSRAM for other things... you need to choose what is better.

So the developer needs to deal with eSRAM to extract the best result possible... I think games will use it in different ways.
 
The Xbone's eSRAM is not only for cache or framebuffer like the eDRAM in the 360. You can use the eSRAM for other things... you need to choose what is better.

So the developer needs to deal with eSRAM to extract the best result possible... I think games will use it in different ways.

Yeah, it's a fully programmable cache. It's the best it can be for what it is.

It might not have been ideal, but it also gave us TLOU.

If the 32MB of ESRAM leads to stuff like Skyrim PS3 in a few years, I guess I'll revisit my opinion? Until then I don't see how this is much more than just FUD for FUD's sake.


It clearly isn't about cost because Sony are paying less for the silicon in PS4. It's about wanting 8GB RAM for OS functions and not wanting to take a gamble on GDDR5 chip densities.

OK, I guess in a roundabout way it is about cost because they could have used twice the number of GDDR5 chips with half the density, but good luck selling a $600 console, MS.

I'm going to need a source for Sony paying less for silicon. Especially when you factor in silicon + memory cost. Kinect silicon doesn't count because that's essentially an extra $100 charge.

MS could not have done double the number of GDDR5 chips. That would have been a 512 bit memory interface which is batshit insane for a console design.
 

benny_a

extra source of jiggaflops
In typical rendering workflows you read and write a lot from and into pixelbuffers. Those buffers have sizes that depend (1) on the target resolution and (2) on the information stored per pixel (e.g. color). Buffers may hold the actual screen content or intermediate information in two-pass renderers [1]. The size of such a buffer is the product of the number of pixels and the information per pixel, e.g. ~23.8MB at 1080p with 12 bytes of information per pixel. So a pool of 32MB is big enough for many scenarios but might be limited for some. KZ:SF, for instance, has render targets with over 40MB of total size.
I remember Timothy Lottes (ex-Nvidia, now Epic) saying that 32MB wasn't enough for forward rendering at 1080p.

The big studios are probably all on deferred rendering by now, I assume.
But will upcoming indies have an issue with this or was Timothy Lottes wrong? (I'm assuming straightforward forward rendering is still a thing.)
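One possible reading of the Lottes comment (an assumption on my part, not his stated maths): forward renderers typically lean on MSAA for anti-aliasing, and multisampled 1080p targets outgrow 32MB very quickly. A minimal sketch, assuming RGBA8 colour and a 32-bit depth buffer:

def msaa_target_mib(width, height, samples, bytes_per_sample):
    # multisampled target size = pixels * samples per pixel * bytes per sample
    return width * height * samples * bytes_per_sample / (1024 * 1024)

for samples in (1, 2, 4):
    colour = msaa_target_mib(1920, 1080, samples, 4)  # RGBA8 colour (assumed format)
    depth = msaa_target_mib(1920, 1080, samples, 4)   # 32-bit depth (assumed format)
    print(samples, round(colour + depth, 1))
# -> 1x: 15.8 MiB, 2x: 31.6 MiB, 4x: 63.3 MiB (already double the 32MB pool)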
 

Lazy8s

The ghost of Dreamcast past
The intended use(s) for the SRAM becomes clear when you consider why they didn't choose eDRAM instead.
 

udiie

Member
In typical rendering workflows you read and write a lot from and into pixelbuffers. Those buffers have sizes that depend (1) on the target resolution and (2) on the information stored per pixel (e.g. color). Buffers may hold the actual screen content or intermediate information in two-pass renderers [1]. The size of such a buffer is the product of the number of pixels and the information per pixel, e.g. ~23.8MB at 1080p with 12 bytes of information per pixel. So a pool of 32MB is big enough for many scenarios but might be limited for some. KZ:SF, for instance, has render targets with over 40MB of total size.

This is actually very well explained for us simple folk. Thank you.
 
It might not have been ideal, but it also gave us TLOU.

If the 32MB of ESRAM leads to stuff like Skyrim PS3 in a few years, I guess I'll revisit my opinion? Until then I don't see how this is much more than just FUD for FUD's sake.

There's still 5GB of dedicated game memory in addition to the ESRAM. So unless devs are completely incompetent, we shouldn't see any awful ports like we saw on the PS3.
 
It might not have been ideal, but it also gave us TLOU.

If the 32MB of ESRAM leads to stuff like Skyrim PS3 in a few years, I guess I'll revisit my opinion? Until then I don't see how this is much more than just FUD for FUD's sake.

TLOU looked great and I love the game, but there was some LoD stuff going on, like super low-res textures on characters for like a second before the high-res textures popped in (not sure if I'm describing this right). One thing (and I know it's super nitpicky) that really stuck out to me was the vehicles, and it seems to be a problem for Naughty Dog; it stuck out to me in the Uncharteds as well. I mean, this is a shitty picture, but look at it.

[Image: The Last of Us screenshot showing the back of a bus]


The back of the bus is one giant polygon with a shitty low res texture. All the vehicles are like this. I think it sticks out more, because the rest of the game looks so great, but man it sucks.
 

Sulik2

Member
Seems to me like Microsoft is building this Machine for a 5 year cycle instead of the 7+ this cycle was built for. Around the time only 32MB really starts to be a bandwidth bottleneck I expect the Xbox2 to be announced.
 

twobear

sputum-flecked apoplexy
I'm going to need a source for Sony paying less for silicon. Especially when you factor in silicon + memory cost. Kinect silicon doesn't count because that's essentially an extra $100 charge.

Haven't pretty much all BOM estimates put the PS4's APU as cheaper than the X1's?

TLOU looked great and I love the game, but there was some LoD stuff going on, like super low-res textures on characters for like a second before the high-res textures popped in (not sure if I'm describing this right). One thing (and I know it's super nitpicky) that really stuck out to me was the vehicles, and it seems to be a problem for Naughty Dog; it stuck out to me in the Uncharteds as well. I mean, this is a shitty picture, but look at it.

[Image: The Last of Us screenshot showing the back of a bus]


The back of the bus is one giant polygon with a shitty low res texture. All the vehicles are like this. I think it sticks out more, because the rest of the game looks so great, but man it sucks.

I'm not saying TLOU is without compromises, though, merely that it was achieved with the not-forward-thinking 256MB of VRAM; hence supporting my point that developers are canny at working around such deficiencies.
 

Slayer-33

Liverpool-2
Depending on how the eSRAM is managed, it’s very possible that the Xbox One could have comparable effective memory bandwidth to the PlayStation 4. If the eSRAM isn’t managed as a cache however, this all gets much more complicated.


Hmm

Seems to me like Microsoft is building this Machine for a 5 year cycle instead of the 7+ this cycle was built for. Around the time only 32MB really starts to be a bandwidth bottleneck I expect the Xbox2 to be announced.


Interesting stuff.
 
Haven't pretty much all BOM estimates put the PS4's APU as cheaper than the X1's?

We don't even know die size. If the PS4 APU is bigger, it'll be more expensive (this assumes there's no special processing steps needed for XB1 vs PS4 APU). Yield doesn't affect that cost because they're paying for the wafer. That doesn't count R&D, but still, there it is.
 
We don't even know die size. If the PS4 APU is bigger, it'll be more expensive (this assumes there's no special processing steps needed for XB1 vs PS4 APU). Yield doesn't affect that cost because they're paying for the wafer. That doesn't count R&D, but still, there it is.

Yield does factor in, because if you say, hypothetically, that one wafer is good for making 10 APUs at 100% yield and you're only getting 50% yield, then you've just doubled the cost of your APU.
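The arithmetic in that hypothetical, in Python (the wafer cost and die count are invented purely for illustration):

def cost_per_good_die(wafer_cost, gross_dies_per_wafer, yield_rate):
    # you pay for the whole wafer, so bad dies inflate the cost of the good ones
    return wafer_cost / (gross_dies_per_wafer * yield_rate)

print(cost_per_good_die(5000, 10, 1.0))  # -> 500.0 per APU at 100% yield
print(cost_per_good_die(5000, 10, 0.5))  # -> 1000.0 per APU at 50% yield (doubled)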
 
We all know Microsoft's strategy in choosing the memory architecture for the Xbone - split into the main DDR3 and on-die ESRAM pools. This thread isn't about that. It's not about comparison with the PS4 APU either. I'm talking about the size Microsoft chose for the ESRAM - 32MB.

I'll start off with a quote from Anandtech's Xbone Arch analysis:

So Anand notes that if the ESRAM is used as a cache (which it is), it will be a significant performance benefit for current workloads, but he also notes that there is not much room for growth.

Moving on to another Anand piece, on the GT3e (Iris Pro 5200):

So on one hand we have a console APU that needs to be future-proofed for at least 5 years, and Microsoft chose 32MB; on the other hand we have a famously miserly Intel, very conservative with its die sizes, and they decided to future-proof theirs with 128MB.

To me, this reads like a non-issue right now, but years down the road devs may need to play the optimization game more and more with the 32MB of ESRAM.

Well, eDRAM and ESRAM are not the same thing. From what I know, eDRAM needs less die space and is therefore less expensive. So if MS had used eDRAM like they did for the Xbox 360, they could have included more than 32MB. However, MS went with ESRAM because it can be manufactured on standard logic processes at more fabs, while Intel has its own fabs so it doesn't need to worry about that.
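The density gap is easy to see from raw cell counts (a minimal sketch counting array cells only; it ignores tag/control logic, redundancy, and the capacitor each eDRAM cell also needs):

bits = 32 * 1024 * 1024 * 8        # 32MB of storage
sram_transistors = bits * 6        # 6T-SRAM: six transistors per bit
edram_transistors = bits * 1       # eDRAM: one transistor (plus a capacitor) per bit

print(sram_transistors / 1e9)      # -> ~1.61 billion, matching the Anandtech estimate quoted in the OP
print(edram_transistors / 1e9)     # -> ~0.27 billion transistors for the same 32MB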
 

artist

Banned
I think it's important to remember that the motivation for the 32MB of ESRAM is not a performance boost, but cost savings. Using it allows them to use relatively inexpensive DDR3 memory outside the APU, compared to more expensive GDDR5. There's a direct tradeoff between ESRAM and the number of CUs you can fit at a given APU size. MS went for cost savings as opposed to performance.

The reason Intel uses eDRAM is that the PC architecture sticks them with a normal DDR3 interface at 1600MHz. MS chose to stick themselves in that situation; they didn't have to. MS could have done a daughter die (like the Intel solution) to avoid one giant die.
I prefaced my OP by setting aside MS's strategy of going with ESRAM. The point I was trying to make was about the decision to go with 32MB and not something more like 40MB, especially given that the design had to be future-proof, while another vendor that is even more conservative with its die sizes went for a bigger size in a market that doesn't require the design to be as future-proof.
 
Intel claims that it would take a 100 - 130GB/s GDDR memory interface to deliver similar effective performance to Crystalwell since the latter is a cache.
Interesting. And their eDRAM's peak bandwidth is 50GB/s, versus 109GB/s for the eSRAM on Xbox One.
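If you treat the ESRAM as if it were a cache (the OP's framing), a deliberately naive time-weighted model shows how a high hit rate pulls the blended bandwidth toward the fast pool. The 95% hit rate comes from the Crystalwell quote earlier in the thread; the 68GB/s DDR3 figure and the model itself are my assumptions, and Intel's "100-130GB/s GDDR equivalent" claim also reflects latency and access-pattern effects this ignores:

def effective_bandwidth(hit_rate, fast_gbps, slow_gbps):
    # time-weighted blend: hits are served by the fast pool, misses by main memory
    return 1.0 / (hit_rate / fast_gbps + (1.0 - hit_rate) / slow_gbps)

# Xbox One-style numbers: 109GB/s ESRAM, 68GB/s DDR3 (assumed main-memory figure)
print(round(effective_bandwidth(0.95, 109.0, 68.0), 1))  # -> 105.8 GB/s blended
print(round(effective_bandwidth(0.50, 109.0, 68.0), 1))  # -> 83.8 GB/s as the hit rate drops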
 
Yield does factor in, because if you say, hypothetically, that one wafer is good for making 10 APUs at 100% yield and you're only getting 50% yield, then you've just doubled the cost of your APU.

Show me a BOM estimate that says that. I don't think you're necessarily wrong, but I've never seen yield mentioned in a BOM estimate, so I assume they're simply using the wafer cost and dies per wafer. Still, I expect the yield of these parts to be somewhat similar since they're identical in a lot of ways.
 

omonimo

Banned
The combination of the two plus the CPU-GPU connection at 30GB/s is how Microsoft arrives at its 200GB/s bandwidth figure, although in reality that’s not how any of this works.
This phrase is... so full of contradictions: "is how Microsoft arrives at its 200GB/s bandwidth figure, although in reality that's not how any of this works."
O_O
So the Xbone arrives at its 200GB/s because... it's a lie?
 