
Xbox One APU - Die Photo - Courtesy of Chipworks

This isn't worth a new thread but it seems "Joel Hruska" at extremetech is making esram latency performance claims. Wasn't esram latency dismissed as irrelevant for GPU performance months ago?

"Microsoft invested more silicon in large, low latency caches, while Sony sank more money into raw bandwidth. As far as performance is concerned, this could well end up a tie; as the Xbox One should be able to access data more quickly, while the PS4 can stream sustained data far more effectively."
 

onQ123

Member
This isn't worth a new thread but it seems "Joel Hruska" at extremetech is making esram latency performance claims. Wasn't esram latency dismissed as irrelevant for GPU performance months ago?

"Microsoft invested more silicon in large, low latency caches, while Sony sank more money into raw bandwidth. As far as performance is concerned, this could well end up a tie; as the Xbox One should be able to access data more quickly, while the PS4 can stream sustained data far more effectively."



(reaction gif)
 

Insane Metal

Gold Member
This isn't worth a new thread but it seems "Joel Hruska" at extremetech is making esram latency performance claims. Wasn't esram latency dismissed as irrelevant for GPU performance months ago?

"Microsoft invested more silicon in large, low latency caches, while Sony sank more money into raw bandwidth. As far as performance is concerned, this could well end up a tie; as the Xbox One should be able to access data more quickly, while the PS4 can stream sustained data far more effectively."

Hum... about this...

There are some interesting differences to explore. First, consider the Xbox One’s Jaguar CPU blocks. Like the PS4, it has two quad-core chips — but the Xbox One has a bit of circuitry hanging off the CPU that the PS4 lacks. Here’s a comparison of the Xbox One and PS4 CPU islands. We had to rotate the blocks to line them up identically, which is why the label is reversed. See the block in red? The PS4 doesn’t seem to have an equivalent. What it actually does is unclear. It’s a bit large to be the built-in audio or the IOMMU that HSA theoretically requires. There’s nothing analogous on any of the Kabini floor plans we’ve ever seen.

Aren't those the MOVE engines? The APU is supposed to have 4, 2 on each CPU...?

Anyway, that conclusion seems off... forget all the areas where the PS4 has an advantage over the XBone and it could end up in a wash? eSRAM is actually good? I mean, we know it's there to compensate for the system's low main-memory bandwidth, but we also know it's a bit of a bottleneck.
 

Raist

Banned
Hum... about this...



Aren't those the MOVE engines? The APU is supposed to have 4, 2 on each CPU...?

Anyway, that conclusion seems off... forget all the areas where the PS4 has an advantage over the XBone and it could end up in a wash? eSRAM is actually good? I mean, we know it's there to compensate for the system's low main-memory bandwidth, but we also know it's a bit of a bottleneck.

This is silly, but I just chuckled when I saw the comparison of the XB1 vs PS4 Jaguar cores.
They picked the one at the bottom and rotated it 180˚, when they could have just picked the one at the top and wouldn't have needed to rotate it at all.
 

Mr Moose

Member
This is silly, but I just chuckled when I saw the comparison of the XB1 vs PS4 Jaguar cores.
They picked the one at the bottom and rotated it 180˚, when they could have just picked the one at the top and wouldn't have needed to rotate it at all.

LOL, it's actually worse, they wasted time flipping the image too (mirroring it, or whatever that's called).
 

tipoo

Banned
This isn't worth a new thread but it seems "Joel Hruska" at extremetech is making esram latency performance claims. Wasn't esram latency dismissed as irrelevant for GPU performance months ago?

"Microsoft invested more silicon in large, low latency caches, while Sony sank more money into raw bandwidth. As far as performance is concerned, this could well end up a tie; as the Xbox One should be able to access data more quickly, while the PS4 can stream sustained data far more effectively."

GPUs aren't latency sensitive; if they were, DDR would be used on them instead of GDDR5. All the short timings in the world won't help them more than raw bandwidth does.

On the CPU side I can believe the site a little more, but if you factor in the difference in memory clock speed, the latency in absolute time isn't even as different as one would think: counted in cycles it's very high, but in actual time I think it was only close to double or so. Modern CPU architectures are pretty good at hiding latency behind prefetchers and caches, too.
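For what it's worth, here's the back-of-envelope conversion (a rough sketch; the CAS and clock numbers below are placeholders for illustration, not the actual console memory timings, which aren't public):

```python
def latency_ns(cas_cycles, command_clock_mhz):
    """Convert a CAS latency given in command-clock cycles to nanoseconds."""
    return cas_cycles * 1000.0 / command_clock_mhz

# Illustrative, assumed timings -- not measured XB1/PS4 figures.
ddr3  = latency_ns(cas_cycles=14, command_clock_mhz=1066)   # DDR3-2133-class
gddr5 = latency_ns(cas_cycles=28, command_clock_mhz=1375)   # GDDR5-5500-class

print(f"DDR3 ~{ddr3:.1f} ns, GDDR5 ~{gddr5:.1f} ns")
# Double the cycle count works out to only ~1.5x in actual time here,
# because the GDDR5 command clock is that much faster.
```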

Kind of funny that not a single tech blog goes as deep as Neogaf. As The Verge put it, "Deep. Neogaf. Shit. "
 
Here's Anand's take on the Chipworks dissection.

Anand also mentioned 16 ROP and esram width bottlenecks in an earlier article. He's probably a bit more reliable than "extremetech".

Yes MS engineers even admitted GPUs aren't latency sensitive in the Digital Foundry interview. I've read on pro-MS forums and beyond3d that MS can use 'tiled resources' to fit gigs of texture data into 32mb esram and that's somehow going to improve efficiency/performance. Sounds like unsourced and unlikely speculation to me.
 
You skipped an important point. The Haswell die shot you pointed to, the one with less wasted space to your eye, is a colorized representation. Actual scans show similar empty areas littered throughout.

In colorized shots unused space is even easier to spot, since it shows up as a rainbow pattern. Unused space is minimal in Intel designs.

Compared to what?

- $100 chip cost
- 348mm2 die size
- Total system power consumption of 140W, so APU TDP likely under 120W
- CPU performance equal to an i3-3220
- GPU performance between a 7850 and 7870

Since you called the design "shitty", could someone else have done significantly better with the same cost, size, and power constraints?

Compared to the same components in a better arrangement. Nvidia released a whole new series, going from GTX 400 to GTX 500, just by improving the die design of Fermi GF100. They got almost a 20% boost on the same TDP and manufacturing process.

Other than that, no: the CPU is worse than any i3. The GPU is pretty good actually, I never said otherwise. Radeon shines in performance per die area; only since Kepler has Nvidia been catching up.

There is also something you have to bear in mind. Both Nvidia and Intel could destroy that design from a technical perspective and make it cheaper to manufacture. What makes Intel and Nvidia more expensive than AMD is the royalties and fees they ask, not the silicon.
 

ShapeGSX

Member
But unused area costs money, lowers performance and raises power leakage.

Unused area does not raise power leakage. It doesn't lower performance. It does cost money. However, the overall plan of the entire chip is determined early on, with feedback from the individual teams. Sometimes portions of designs need to grow or shrink, which leads to gaps. Unused space on a chip, any chip these days, is pretty much unavoidable. It may be obvious from 10,000 feet, or it may not be. But it is there. Some unused space is actually filled in with stuff that looks like transistors, so you wouldn't be able to tell anyway.

The point is, though, you really can't tell how much unused space there is on a chip by looking at a die photo.
 

Raist

Banned
Today in Munchkin land.

Misterx: What shock at start was going to happen?
Insider: The reason I think chipworks changed the diagram in regards to the x1, just like there wiiu stuff they still cant give 100% proof to that gpgpu design tear down.. the x1 is a whole new silicon "dark silicon" hence the darker color and why they are going with what has been out in the public domain. Maybe it is an honest mistake maybe not... you cant say it is the same gpu design as ps4 when it is not. The cu's are smaller on the x1 gpu because of the transistor packaging that is used is also a new technology that is not used on ps4 architecture so more cu can be placed in the gp part. Whats even more shocking is the gpu core is also larger. Any body with two eyes can see this. The cluster of esram on the gpu is 47mb the other 17 mb is cach for cpu/audio dsp. Like I have said 64mb of sram but 47mb is what ms refer to as on chip storage. .. see the sram cach between the cpu clusters try adding that up cos that is the app gpu .it can read and write into the cpu cach directly.

Misterx: So, no 2.5d then? Gpu cores are larger and chipworks show 14 of them, not 18. Where are 4 more?
Insider: There are 4 cu for cpu/ app gpu older cu's And then there is 14 for gpu.. there is only 12 avalible but two are and can be unlocked and will be unlocked. Iv told you the gpu is based on 280x and you can see the gpu core is larger .. then the 7850 core in the ps4.... I have not lied to any body really disappointed in some peoples posts but I believe in free speech I hope they can say the same about them selves . there is definitely 2.5d stacking in xbox one architecture I have seen the rnd xrays. But I have been told that the soc also uses advanced transistor packing that can only be applied to dark silicon. I am not a 2.5d 3d micro processor architecture designer.. I understand 89% of the logic tho. They are using a wired approach not the glue type that 3d w2w use.. and I stand by my word there is major errors in the chipworks rulings.. and any body who takes the time will see this.. ...

I can't stop laughing.
 
Look, chip design houses have full teams devoted to improving die arrangement and layout. Doing it right is expensive, and sometimes it isn't profitable at all. In this case, both chips will sell millions, but AMD isn't fabbing them, so they don't care about die size as much as Intel, or mobile parts vendors, do.

Chip layout matters for not screwing up timings and for achieving better speeds by keeping connections as short as you can; reduced die size is an added benefit. Any engineer who has a look at this chip will tell you that the SoC designer wasn't the designer of any of the individual parts. It is pretty obvious that they just put some blocks together rather than doing a monolithic design.

That's all, no deal breaker, no preorders cancelled. The layout is ugly; they saved money by not polishing the final design (yet). Just that.
 

Argyle

Member
Look, chip design houses have full teams devoted to improving die arrangement and layout. Doing it right is expensive, and sometimes it isn't profitable at all. In this case, both chips will sell millions, but AMD isn't fabbing them, so they don't care about die size as much as Intel, or mobile parts vendors, do.

Are you sure?

Based on the comments made by Lisa Su and the fact that the only viable game consoles scheduled to be unveiled this year are Microsoft Xbox Next, Sony PlayStation 4 “Orbis”, it looks like AMD will not license its graphics and processing technologies to platform owners, but will sell semi-custom accelerated processing units to them.

Source:
http://www.xbitlabs.com/news/multim...s_to_Account_for_20_of_Revenue_This_Year.html
 

kitch9

Banned
Look, chip design houses have full teams devoted to improving die arrangement and layout. Doing it right is expensive, and sometimes it isn't profitable at all. In this case, both chips will sell millions, but AMD isn't fabbing them, so they don't care about die size as much as Intel, or mobile parts vendors, do.

Chip layout matters for not screwing up timings and for achieving better speeds by keeping connections as short as you can; reduced die size is an added benefit. Any engineer who has a look at this chip will tell you that the SoC designer wasn't the designer of any of the individual parts. It is pretty obvious that they just put some blocks together rather than doing a monolithic design.

That's all, no deal breaker, no preorders cancelled. The layout is ugly; they saved money by not polishing the final design (yet). Just that.
Jesus dude stop with the bollocks.
 

Perkel

Banned
There is also something you have to bear in mind. Both Nvidia and Intel could destroy that design from a technical perspective and make it cheaper to manufacture. What makes Intel and Nvidia more expensive than AMD is the royalties and fees they ask, not the silicon.

In case you didn't catch it yet: they can't. Neither Nvidia nor Intel has good APUs. Unless they have good APUs they CAN'T beat AMD in price-to-power ratio, because using Intel or Nvidia would mean using two separate chips instead of one, which, in case you don't know, increases the final price of the hardware a lot.
 
In case you didn't catch it yet: they can't. Neither Nvidia nor Intel has good APUs. Unless they have good APUs they CAN'T beat AMD in price-to-power ratio, because using Intel or Nvidia would mean using two separate chips instead of one, which, in case you don't know, increases the final price of the hardware a lot.

AMD didn't have any APU as powerful as the PS4's, so they did a custom design using their existing tech. Nothing prevents Intel or Nvidia from doing the same. No one can match Intel at performance per watt.

Also, the 360 Slim had chips from two different vendors, IBM and ATI, fused into a single die. With the right licenses, there isn't any problem in buying different parts and manufacturing them together.

Thing is, none of them were willing to do that at the price they were offered. Again, it isn't about tech, it's about politics. Sony and MS took the best deal over the best technology. I think I've made my point clear enough.
 

SRG01

Member
I've read on pro-MS forums and beyond3d that MS can use 'tiled resources' to fit gigs of texture data into 32mb esram and that's somehow going to improve efficiency/performance. Sounds like unsourced and unlikely speculation to me.

That actually wouldn't work. Getting data in and out of eSRAM may be fast, but there's no direct pipe from the eSRAM to the larger, slower DDR3 memory pool, so that transfer speed will always be the bottleneck.

As I've remarked in previous threads, that bandwidth would be useful for fast framebuffer operations -- in this case, tiled framebuffer operations -- like the 360, but otherwise it's too limited to do anything worthwhile.
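To put rough numbers on that (simple arithmetic, assuming 4 bytes per pixel per surface, uncompressed):

```python
# How many full-size 1080p render targets fit in the 32 MB of eSRAM?
# Assumes 4 bytes per pixel per surface (e.g. RGBA8 color or a 32-bit
# depth/stencil buffer); real games mix formats and use compression.
width, height   = 1920, 1080
bytes_per_pixel = 4
esram_bytes     = 32 * 1024 * 1024

surface_bytes     = width * height * bytes_per_pixel    # ~7.9 MB per target
surfaces_that_fit = esram_bytes // surface_bytes        # 4 targets

print(f"{surface_bytes / 2**20:.1f} MB per 1080p surface, "
      f"{surfaces_that_fit} fit in eSRAM")
# Roughly four 32-bit 1080p surfaces -- enough for a small G-buffer plus
# depth, but nowhere near "gigs of texture data" without tiling it in and
# out over the DDR3 link.
```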
 

Argyle

Member
Yes. I'm pretty sure AMD is fabless. Not exclusive insider info.

Sure, no one is arguing that, that's common knowledge.

Tell me this though, do you really believe that AMD "don't care about die size" because they "aren't fabbing" any of their CPUs/APUs/GPUs?

Since they are selling finished APUs to MS and Sony (I believe this is because of the x86 cross licensing agreement with Intel - they can't sublicense the design for someone else to manufacture) any yield problems or wasted die space will eat directly into their margins.
 
Tell me this though, do you really believe that AMD "don't care about die size" because they "aren't fabbing" any of their CPUs/APUs/GPUs?

AMD has been doing badly at die size lately because they don't have the resources. They aren't prioritizing die layout, since it isn't that important.

As I pointed out earlier, that is very expensive and of dubious benefit. I'm sure the next node jumps for these APUs will see improvements in that regard, though.

Since they are selling finished APUs to MS and Sony (I believe this is because of the x86 cross licensing agreement with Intel - they can't sublicense the design for someone else to manufacture) any yield problems or wasted die space will eat directly into their margins.

TSMC is the company that handles that. AMD's relationship with TSMC and GlobalFoundries is far too complex to discuss in a thread dedicated to counting billions of transistors and telling apart shades of silicon color.

"dark silicon"

LMAO!!....the best part is there actually is such a thing...unfortunately for misterX it has nothing to do with the actual color of the chips

(image: dark silicon)
 
Here's Anand's take on the Chipworks dissection.

Anand also mentioned 16 ROP and esram width bottlenecks in an earlier article. He's probably a bit more reliable than "extremetech".

Yes MS engineers even admitted GPUs aren't latency sensitive in the Digital Foundry interview. I've read on pro-MS forums and beyond3d that MS can use 'tiled resources' to fit gigs of texture data into 32mb esram and that's somehow going to improve efficiency/performance. Sounds like unsourced and unlikely speculation to me.

Hmm I found this part of the article very interesting...

In order to accommodate the eSRAM on die Microsoft not only had to move to a 12 CU GPU configuration, but it’s also only down to 16 ROPs (half of that of the PS4). The ROPs (render outputs/raster operations pipes) are responsible for final pixel output, and at the resolutions these consoles are targeting having 16 ROPs definitely puts the Xbox One as the odd man out in comparison to PC GPUs. Typically AMD’s GPU targeting 1080p come with 32 ROPs, which is where the PS4 is, but the Xbox One ships with half that. The difference in raw shader performance (12 CUs vs 18 CUs) can definitely creep up in games that run more complex lighting routines and other long shader programs on each pixel, but all of the more recent reports of resolution differences between Xbox One and PS4 games at launch are likely the result of being ROP bound on the One. This is probably why Microsoft claimed it saw a bigger increase in realized performance from increasing the GPU clock from 800MHz to 853MHz vs. adding two extra CUs. The ROPs operate at GPU clock, so an increase in GPU clock in a ROP bound scenario would increase performance more than adding more compute hardware.

I remember a while back a lot of people were making the statement that 32 ROPs were not necessary at 1080p and it was a bit overkill for these consoles. This statement seems to contradict that.
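If the One really is ROP bound, the arithmetic behind Microsoft's clock-vs-CUs claim is simple enough (theoretical peak fillrate only; the figures below just multiply the publicly quoted ROP counts and clocks):

```python
# Peak pixel fillrate = ROPs x GPU clock (theoretical pixels per second).
def fillrate_gpix_s(rops, clock_mhz):
    return rops * clock_mhz * 1e6 / 1e9   # gigapixels per second

xb1_800 = fillrate_gpix_s(16, 800)   # 12.8 GPix/s
xb1_853 = fillrate_gpix_s(16, 853)   # ~13.6 GPix/s, a ~6.6% gain
ps4     = fillrate_gpix_s(32, 800)   # 25.6 GPix/s

print(xb1_800, xb1_853, ps4)
# The clock bump raises ROP throughput directly, while two extra CUs only
# add shader ALUs that an already ROP-bound frame can't turn into pixels.
```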
 
"dark silicon"

LMAO!!....the best part is there actually is such a thing...unfortunately for misterX it has nothing to do with the actual color of the chips

So gud.....
So darker silicon is better silicon?

Hmm I found this part of the article very interesting...

In order to accommodate the eSRAM on die Microsoft not only had to move to a 12 CU GPU configuration, but it’s also only down to 16 ROPs (half of that of the PS4). The ROPs (render outputs/raster operations pipes) are responsible for final pixel output, and at the resolutions these consoles are targeting having 16 ROPs definitely puts the Xbox One as the odd man out in comparison to PC GPUs. Typically AMD’s GPU targeting 1080p come with 32 ROPs, which is where the PS4 is, but the Xbox One ships with half that. The difference in raw shader performance (12 CUs vs 18 CUs) can definitely creep up in games that run more complex lighting routines and other long shader programs on each pixel, but all of the more recent reports of resolution differences between Xbox One and PS4 games at launch are likely the result of being ROP bound on the One. This is probably why Microsoft claimed it saw a bigger increase in realized performance from increasing the GPU clock from 800MHz to 853MHz vs. adding two extra CUs. The ROPs operate at GPU clock, so an increase in GPU clock in a ROP bound scenario would increase performance more than adding more compute hardware.

I remember a while back a lot of people were making the statement that 32 ROPs were not necessary at 1080p and it was a bit overkill for these consoles. This statement seems to contradict that.

I think it's because the PS4, while it has fast RAM, doesn't have enough bandwidth to feed 32 ROPs at 100%; going from 16 to 32 ROPs is probably more of a design and manufacturing choice.
But it will definitely help: with the higher bandwidth in the PS4 you can probably feed 16+X ROPs at 100%, instead of maybe only 10~16 ROPs at 100% on the X1.
I'm not sure about that claim, though. Pretty sure someone can do the calculations if they want to, or post the formula for the bandwidth needed to feed the ROPs (rough sketch below).
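Here's a rough version of that formula (worst case: every ROP writes a 4-byte colour pixel each clock, and alpha blending adds a read of the destination; it ignores depth traffic and colour compression, so treat it as a crude upper bound):

```python
# Bandwidth needed to keep N ROPs fed at a given clock (crude upper bound).
def rop_bandwidth_gb_s(rops, clock_mhz, bytes_per_pixel=4, blending=True):
    bytes_per_op = bytes_per_pixel * (2 if blending else 1)  # write (+ read)
    return rops * clock_mhz * 1e6 * bytes_per_op / 1e9        # GB/s

print(rop_bandwidth_gb_s(32, 800))   # ~205 GB/s to saturate 32 ROPs (PS4)
print(rop_bandwidth_gb_s(16, 853))   # ~109 GB/s for 16 ROPs (XB1)
# Set against the commonly quoted ~176 GB/s of GDDR5 on PS4 and ~109 GB/s
# of eSRAM (plus ~68 GB/s DDR3) on XB1, which is roughly where the
# "can't feed 32 ROPs at 100%" argument comes from.
```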
 

joecanada

Member
Lol, that guy's blog is so entertaining.

If you analyze the writing, it's obvious that MisterX wrote both his own and the "insider" parts in his own words, since the style is identical...
So... either he taped an interview and then paraphrased it in his own shitty English (which makes no sense), or..... would you like to guess the more logical conclusion?
 
X1 games should definitely be using SMAA like Ryse does in cases where they can't use MSAA or some more demanding AA.

Hope to see some papers from Crytek about their custom AA solution soon; it should be interesting. Because sharing is caring, and it attracts a lot of techies to your studio.
 

Timu

Member
Hope to see some papers from Crytek about their custom AA solution soon; it should be interesting. Because sharing is caring, and it attracts a lot of techies to your studio.
I also hope they share their upscaler as well; it's better than what MS used in the X1 for sub-1080p games. =p
 

kitch9

Banned
Yes. I'm pretty sure AMD is fabless. Not exclusive insider info.



If you don't like reading discussion about dies, why did you come into a thread that talks about dies?

I am interested in talking about dies; I am, however, completely uninterested in reading useless information such as:

"Space LOL!"
"Leakage LOL"
"Automated designs LOL AMD!"
"Intel are awesome LOL, look at the lack of space, so awesome LOL!"

Seriously wat?
 

RoadHazard

Gold Member

Half of GAF paid off by Sony? Well, then where the hell is my money?! I will admit to being more positive about the PS4 than the XBO, so shouldn't I be on this payroll? Will I receive it as a check, a wire transfer, or what? How much will I get? Is it a flat rate, or a certain amount for each positive word written about the PS4?

Please get back to me on this, MisterX, as you seem to know what's going on.
 