tinfoilhatman
all of my posts are my avatar
My understanding is that the 128MB of memory in Haswell is a system-managed general cache, whereas the 32MB in the XB1 is a user-managed scratchpad.
If this is the case, the implication is subtle but important. For example, if you wanted to use the 32MB as a temporary location for render target data (pixel storage, effectively) then you could run out of space - just like the 10MB eDRAM buffer in the 360 limited resolution.
Hypothetical examples for a 1920x1080 frame buffer:
forward rendered FP16 HDR with 2x MSAA:
(8 bytes colour + 4 bytes depth/stencil) x 2 x 1920 x 1080 ≈ 47.5MB
typical (eg frostbite) deferred renderer g-buffer, 4 MRTs each at 32bpp (no MSAA):
(4 bytes per MRT x 4 + 4 bytes depth/stencil) x 1920 x 1080 ≈ 39.6MB
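To make the arithmetic above easy to check, here is a minimal sketch that recomputes both cases. It assumes "MB" means 2^20 bytes and that every sample stores the full colour and depth/stencil footprint; the function and constant names are just illustrative.

```python
MIB = 1024 * 1024
ESRAM_BYTES = 32 * MIB  # the 32MB scratchpad discussed above


def framebuffer_bytes(width, height, bytes_per_sample, samples=1):
    """Total storage for all render targets, with no compression or padding."""
    return width * height * bytes_per_sample * samples


# Forward FP16 HDR: 8 bytes colour + 4 bytes depth/stencil, 2x MSAA
forward = framebuffer_bytes(1920, 1080, 8 + 4, samples=2)

# Deferred g-buffer: four 32bpp MRTs + 4 bytes depth/stencil, no MSAA
deferred = framebuffer_bytes(1920, 1080, 4 * 4 + 4)

for name, size in (("forward", forward), ("deferred", deferred)):
    fits = size <= ESRAM_BYTES
    print(f"{name}: {size / MIB:.1f} MiB, fits in 32 MiB: {fits}")
```

Both totals come out above 32MB, which is the point of the two examples.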
This doesn't necessarily mean these cases are impossible - you could render the scene in tiles or leave some buffers in DDR - but it does add a significant layer of complexity (it won't 'just work' efficiently and automatically like the haswell cache).
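As a rough illustration of the tiling option, the sketch below works out how many full-width horizontal strips each case would need if every render target had to fit in the 32MB at once. This is an assumption-heavy simplification: it ignores hardware tile alignment, geometry that has to be resubmitted per strip, and any other data competing for the same space, and the helper names are hypothetical.

```python
import math

MIB = 1024 * 1024
BUDGET = 32 * MIB  # assume the whole scratchpad is free for render targets


def max_rows_per_strip(width, bytes_per_pixel, budget=BUDGET):
    """Most full-width rows of pixels that fit in the budget."""
    return budget // (width * bytes_per_pixel)


def strips_needed(width, height, bytes_per_pixel, budget=BUDGET):
    """Number of horizontal strips needed to cover the whole frame."""
    return math.ceil(height / max_rows_per_strip(width, bytes_per_pixel, budget))


# Bytes per pixel taken from the two examples above
forward_bpp = (8 + 4) * 2   # FP16 colour + depth/stencil, 2x MSAA
deferred_bpp = 4 * 4 + 4    # four 32bpp MRTs + depth/stencil

print(strips_needed(1920, 1080, forward_bpp))
print(strips_needed(1920, 1080, deferred_bpp))
```

Under these assumptions both cases need two passes over the scene, which is where the extra complexity comes from.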
The other concern I have is that it doesn't remove the need to copy data in/out of ESRAM - and those copies will still be limited by DDR bandwidth. So using ESRAM will only make sense in cases where you are reading/writing the memory a large number of times within the frame - *and* those accesses are often missing the on-chip caches (which in a well-designed renderer isn't as common as you'd think).
Does any of this take into account tiling, deferred rendering or virtual texturing?
These seem to be a MAJOR factor in the X1 hardware design and its custom, stripped-down version of DirectX.