
Rumor: Wii U final specs

Oblivion

Fetishing muscular manly men in skintight hosery
To compare launch consoles, the GameCube was cheaper to manufacture than the original Xbox despite being much smaller. I don't think you can say there's any correlation as far as consoles go.

That's true... but let me ask my question like this.

Both the 360 and PS3 were huge systems when they came out. I'd imagine that MS and Sony didn't want them to be so big and clunky if they could have avoided it, right? If they could have made them smaller at launch, they probably would have, right? So I would then assume they didn't because doing that would have made their systems more costly. Follow me?
 

TAJ

Darkness cannot drive out darkness; only light can do that. Hate cannot drive out hate; only love can do that.
As for the Wii U, small and lower power probably increases reliability.

All other things being equal, a smaller enclosure reduces reliability. Fact.
 
Well yeah, but that's after several revisions spread over several years.

I'm referring to new consoles at launch.

Small doesn't necessarily make it more expensive, especially with the Wii U going for "older" hardware scaled down a bit.

Smaller does, however, mean it's more energy efficient (the further current has to travel, the more resistance it meets, the more power you need to overcome that resistance, and the more heat you generate). Which in turn means it's cheaper in the long run for the consumer... but most people don't look at their energy bills and go "Wow, it saved me $10 a year over the PS3, and $10 a year over 5 years is enough for a whole game... almost!"
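To make the physics concrete, here's a toy sketch of that trace-length argument, assuming resistance scales linearly with length (all numbers are illustrative, not measured console values):

```python
# Toy model: resistive loss P = I^2 * R, with R proportional to trace
# length. Every number below is illustrative, not a real console value.
def resistive_loss_watts(current_a: float, ohms_per_cm: float, length_cm: float) -> float:
    return current_a ** 2 * (ohms_per_cm * length_cm)

long_trace = resistive_loss_watts(current_a=2.0, ohms_per_cm=0.01, length_cm=10.0)
short_trace = resistive_loss_watts(current_a=2.0, ohms_per_cm=0.01, length_cm=5.0)
print(long_trace, short_trace)  # 0.4 W vs 0.2 W: halve the path, halve the loss
```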
 
I get where you're going. A laptop is more expensive than the equivalent-power desktop.

I don't think any console has ever gotten so small that this entered the equation in any significant way, however. Perhaps the Xbox 360 could have shipped with a larger hard drive at launch if they'd used a 3.5" drive instead of a 2.5". But the size of the launch 360 wasn't because it was cheaper to let it be bigger; it was because the technology they used just wasn't possible to put in a smaller package at the time.

PC-to-laptop miniaturization is completely different from what you have to deal with in consoles. In PCs you have to use a very specific type of CPU (x86-64) if you want to compete. These processors were never designed to be super power efficient to begin with, and since they have to be very generalized processors instead of specialized ones, they often have to be clocked very high to handle all the tasks people do on them.

Now, essentially, what you put in a desktop is functionally the same hardware as in a laptop. There are a few exceptions (especially at the super low end)... but if you want a high-end laptop, you essentially have to shove desktop parts into it. To make sure those desktop parts don't overheat the small case, they have to be individually tested to confirm they meet lower voltage requirements, and you need stronger coolers because you can't move air through the case as easily. That means (especially on the high end) parts have to be individually checked and binned to handle high speeds at low voltages, and thus carry a much higher price.

Still, the CPUs are more or less identical. Same fabrication, same assemblies, same features. In consoles you aren't trying to shove a very generalized PC processor into a tiny box; instead you're shoving in very specialized hardware... This lets you get away with "weaker" parts that are generally cheaper/smaller and still get the same or better results than something with much greater paper specs.
 
So, back on topic: does the I/O reside on the GPU? It's somewhere on the MCM, but the GPU is the only logical place where you could sit a north bridge. Assuming 40nm, here's what we know:

Die Sizes

RV770: @55nm - 256mm^2, @40nm - roughly 186mm^2
RV740: @40nm - 137mm^2
RV730: @55nm - 146mm^2, @40nm - ~106mm^2

Wii U: @40nm - 156.21mm^2
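As a sanity check on those shrink estimates, here's a rough die-shrink calculator: ideal area scales with the square of the node ratio, and the efficiency fudge factor is back-fitted to the RV770 numbers above (an assumption, since pads and analog blocks don't scale cleanly):

```python
# Rough die-shrink estimator. Ideal area scales with (node_to/node_from)^2;
# real shrinks fall short, so divide by an efficiency factor. The 0.73
# here is back-fitted to the RV770 55nm->40nm estimate, purely an assumption.
def shrunk_area_mm2(area_mm2: float, node_from_nm: float, node_to_nm: float,
                    efficiency: float = 0.73) -> float:
    ideal = area_mm2 * (node_to_nm / node_from_nm) ** 2
    return ideal / efficiency

print(round(shrunk_area_mm2(256, 55, 40)))  # RV770: ~185mm^2, close to the "roughly 186" above
print(round(shrunk_area_mm2(146, 55, 40)))  # RV730: ~106mm^2, matching the list
```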

Let's hope the GPU is using 28nm.
 

Oblivion

Fetishing muscular manly men in skintight hosery
In the case of the Wii U, miniaturization includes combining what would otherwise be separate chips into customized packages. That decreases the size and increases the cost. They're basically brand new chips, even if they existed in other forms before.

But if you were talking about taking a chip from 65nm to 40nm, there would be an increase in yield, meaning a decrease in price. And having fewer chips on the motherboard also saves cost in other ways.

I'm open to correction, but yeah, you can't really say smaller size = larger cost. It's about how you go about miniaturization.

Small doesn't necessarily make it more expensive, especially with the Wii U going for "older" hardware scaled down a bit.

Smaller does, however, mean it's more energy efficient (the further current has to travel, the more resistance it meets, the more power you need to overcome that resistance, and the more heat you generate). Which in turn means it's cheaper in the long run for the consumer... but most people don't look at their energy bills and go "Wow, it saved me $10 a year over the PS3, and $10 a year over 5 years is enough for a whole game... almost!"

I get where you're going. A laptop is more expensive than the equivalent-power desktop.

I don't think any console has ever gotten so small that this entered the equation in any significant way, however. Perhaps the Xbox 360 could have shipped with a larger hard drive at launch if they'd used a 3.5" drive instead of a 2.5". But the size of the launch 360 wasn't because it was cheaper to let it be bigger; it was because the technology they used just wasn't possible to put in a smaller package at the time.

Cool, makes sense. Thanks, guys.
 

Reiko

Banned
Not confirmed... I said nothing about what the WiiU footage could be because I don't have access to see for myself.

edit:

What's odd is that they would re-render the footage (presumably for the Wii U costumes) and use lower settings. It would also be odd for them to render in real time, given that the main impetus for using pre-rendered footage in the game was so that you wouldn't have rendering bugs/streaming issues in key story scenes (and/or to hide load times for the next section).

Oh well. Jumped the gun.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
I can't remember anyone talking about the RAM possibly being slower than 360/PS3, though in no way do I remember everything that was said.
Here you are.

Apropos, does anybody have a clue why Anand insists the DDR3 clock is 800MHz? Because I personally don't.
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Because that's the speed rating for Hynix gDDR3 chips marked "12C". 900MHz would be "11C".

I'm still not following. Let me quote what Hynix say on their package marking:

[image: Hynix package-marking guide, MM_H5TQ4G4(8_6)MFR.jpg]
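For what it's worth, here's the decode rule the "12C"/"11C" claim seems to imply, sketched out; reading the two digits as the clock period in tenths of a nanosecond is my assumption, not something the marking guide confirms:

```python
# Hypothetical decode of the Hynix speed digits: if "12" / "11" are the
# clock period in tenths of a nanosecond (an assumption, per the lead-in),
# the implied clocks line up with the 800MHz and 900MHz bins being debated.
def marking_to_mhz(digits: int) -> int:
    period_ns = digits / 10.0
    return round(1000.0 / period_ns)

print(marking_to_mhz(12))  # ~833 -> would be sold as the 800MHz bin
print(marking_to_mhz(11))  # ~909 -> the 900MHz bin
```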
 

Donnie

Member
Well, DDR3-1600 actually runs at 800MHz. That's where I believe we've settled. As for GFLOPS, if they even still matter, for 320 shaders the clocks I proposed get you about 340 GFLOPS.

Considering the die size, I think 480 SPs fits better than 320. It's about 15% bigger than an HD4770, which had 640 SPs. Of course we have to take eDRAM into account, but that won't be anything close to half the chip size. If we remove a third for eDRAM, then 480 SPs seems reasonable.
 

disap.ed

Member
Really depends on which process node the GPU is manufactured on. If it's 40nm, it's unlikely to do more than 300-350 GFLOPS if you keep in mind that the 32MB of eDRAM takes up a lot of die space.
If it's on 28nm, the clock could also be at the 800MHz of the main RAM without drawing too much power, and even with only 320 shader units it would be in the range of 500 GFLOPS. That would be in the range of a current Trinity, so not too shabby IMO.
But after the disappointing RAM specs, I really doubt they are on a 28nm process, so 350-400 GFLOPS max is what I expect.
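All the GFLOPS figures being thrown around follow from the same rule of thumb for these AMD parts: shaders × 2 FLOPs per cycle (multiply-add) × clock. A quick sketch, where the 530MHz line is just a hypothetical clock that back-solves the ~340 GFLOPS figure quoted above:

```python
# Rule of thumb for R700-era AMD parts: each shader does a fused
# multiply-add, i.e. 2 FLOPs per cycle.
def gflops(shader_count: int, clock_mhz: float) -> float:
    return shader_count * 2 * clock_mhz / 1000.0

print(gflops(320, 800))  # 512.0 -> the "range of 500 GFLOPS" 28nm case
print(gflops(320, 530))  # 339.2 -> a hypothetical clock landing near ~340
print(gflops(480, 400))  # 384.0 -> the 480 SP / 400MHz scenario in the thread
```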
 
Considering the die size, I think 480 SPs fits better than 320. It's about 15% bigger than an HD4770, which had 640 SPs. Of course we have to take eDRAM into account, but that won't be anything close to half the chip size. If we remove a third for eDRAM, then 480 SPs seems reasonable.
*Misread*, but I'll leave the die sizes.

Die Sizes

RV770: @55nm - 256mm^2, @40nm - roughly 186mm^2
RV740: @40nm - 137mm^2
RV730: @55nm - 146mm^2, @40nm - ~106mm^2

Wii U: @40nm - 156.21mm^2

Really depends on which process node the GPU is manufactured on. If it's 40nm, it's unlikely to do more than 300-350 GFLOPS if you keep in mind that the 32MB of eDRAM takes up a lot of die space.
If it's on 28nm, the clock could also be at the 800MHz of the main RAM without drawing too much power, and even with only 320 shader units it would be in the range of 500 GFLOPS. That would be in the range of a current Trinity, so not too shabby IMO.
But after the disappointing RAM specs, I really doubt they are on a 28nm process, so 350-400 GFLOPS max is what I expect.


If it reaches over 350 GFLOPS, we would have to look at a new architecture, because the R700 line seems to cap at 12 GFLOPS/W (RV740), and the GPU has to be quite a bit smaller. Remember we're dealing with GPU TDP maxes of 25-30W.
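That ceiling is just the efficiency figure times the power budget, both taken from the post above:

```python
# Ceiling implied by the argument above: ~12 GFLOPS/W for the R700 line
# times a 25-30W GPU power budget.
R700_GFLOPS_PER_WATT = 12
for tdp_w in (25, 30):
    print(f"{tdp_w}W -> {R700_GFLOPS_PER_WATT * tdp_w} GFLOPS max")
# 25W -> 300 GFLOPS max
# 30W -> 360 GFLOPS max
```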
 

McHuj

Member
Based on the RAM speeds and sizes, I think a 28nm GPU would be absolute overkill. At this point, any guess in the 240-320 shader count range seems reasonable.
 

Donnie

Member
Really depends on which process node the GPU is manufactured on. If it's 40nm, it's unlikely to do more than 300-350 GFLOPS if you keep in mind that the 32MB of eDRAM takes up a lot of die space.
If it's on 28nm, the clock could also be at the 800MHz of the main RAM without drawing too much power, and even with only 320 shader units it would be in the range of 500 GFLOPS. That would be in the range of a current Trinity, so not too shabby IMO.
But after the disappointing RAM specs, I really doubt they are on a 28nm process, so 350-400 GFLOPS max is what I expect.

Unsure why people are assuming 320 SPs considering the die size?

Based on the RAM speeds and sizes, I think a 28nm GPU would be absolute overkill. At this point, any guess in the 240-320 shader count range seems reasonable.

It's a 156mm^2 chip, so if it's 40nm, even with a third of the chip taken by eDRAM (which is probably overkill) that still leaves plenty of room for much more than 320 SPs. The clock speed is the real question mark for me at the moment; 800MHz RAM would suggest a 400MHz GPU to me. But then Matt did say 600MHz was a little too high, and 400MHz doesn't fit that description, so who knows.
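Donnie's die-budget argument reduces to simple arithmetic if you take the HD4770's shader density as a yardstick (crude, since that die also carries TMUs, ROPs, and so on; the one-third eDRAM share is his allowance, not a known figure):

```python
# Crude SP-count estimate: scale the HD4770's shader density by the die
# area left after eDRAM. The 1/3 eDRAM share is Donnie's allowance above.
hd4770_sps, hd4770_mm2 = 640, 137.0
wiiu_mm2 = 156.21
edram_share = 1 / 3

logic_mm2 = wiiu_mm2 * (1 - edram_share)
estimate = logic_mm2 * (hd4770_sps / hd4770_mm2)
print(round(estimate))  # ~486 -> why "480 fits better than 320" above
```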
 

Durante

Member
Is the consensus that Wii U is still a sizeable jump over current gen?
Depending on the definition of "sizeable", that never was the consensus. Consensus is not made by a few people screaming really loudly for months.

Though it is interesting to note that the discussion a year ago was centered on how much PS4/720 ports would need to be cut down for Wii U, while now it's mostly about whether PS360 ports should be on par or not.
 
Is the consensus that Wii U is still a sizeable jump over current gen?

Our expectations are lowering by the hour.

Unsure why people are assuming 320 SPs considering the die size?



It's a 156mm^2 chip, so if it's 40nm, even with a third of the chip taken by eDRAM (which is probably overkill) that still leaves plenty of room for much more than 320 SPs. The clock speed is the real question mark for me at the moment; 800MHz RAM would suggest a 400MHz GPU to me. But then Matt did say 600MHz was a little too high, and 400MHz doesn't fit that description, so who knows.

eDRAM isn't the only thing moved onto the GPU die IIRC. SPs aren't exactly small either.
 

Donnie

Member
Depending on the definition of "sizeable", that never was the consensus. Consensus is not made by a few people screaming really loudly for months.

Though it is interesting to note that the discussion a year ago was centered on how much PS4/720 ports would need to be cut down for Wii U, while now it's mostly about whether PS360 ports should be on par or not.

The circle will be complete soon enough...
 

Donnie

Member
Our expectations are lowering by the hour.



eDRAM isn't the only thing moved onto the GPU die IIRC. SPs aren't exactly small either.

I realise the system has a DSP and an ARM chip, but there is a small third die on the MCM package which could be the ARM chip; either way it's going to take up a tiny amount of space.

Also, I'm not talking about the size of SPs; I'm looking at the die size of the chip vs other 40nm chips and how many SPs they have. You have the HD4770, which is 137mm^2 and had 640 SPs, while GPU7 is 156mm^2. Which has me struggling to see how 320 SPs is being talked about as the maximum possible here by some people. Does anyone have a reasonable estimate for the space taken up by 32MB of eDRAM on a 40nm chip?
 
If the RAM is so slow or so problematic... why don't developers complain about it? I mean, we've heard complaints about the CPU, but praise for the GPU and RAM.
Are we sure about the clock speeds (I mean, is it still possible to overclock the RAM?) or about the bandwidth being problematic?
 

Donnie

Member
If the RAM is so slow or so problematic... why don't developers complain about it? I mean, we've heard complaints about the CPU, but praise for the GPU and RAM.
Are we sure about the clock speeds (I mean, is it still possible to overclock the RAM?) or about the bandwidth being problematic?

Yeah, we've had devs talking about being able to use higher-res textures on Wii U compared to current gen, and nobody at all has complained about RAM bandwidth. Seems quite odd; this console is full of mysteries :)
 

McHuj

Member
Could you be more specific?

Also, what does the size of SPs have to do with it? I'm looking at the die size vs other 40nm chips and how many SPs they have. You have the HD4770, which is 137mm^2 and had 640 SPs, while GPU7 is 156mm^2. Which has me struggling to see how 320 SPs is being talked about as the maximum possible here by some people. Does anyone have a reasonable estimate for the space taken up by 32MB of eDRAM on a 40nm chip?

They could fit more than 320 SPs in that space, but I think (and this is my opinion) that more than 320 is a waste.
 
I realise the system has a DSP and an ARM chip, but there is a small third die on the MCM package which could be the ARM chip; either way it's going to take up a tiny amount of space.

Also, I'm not talking about the size of SPs; I'm looking at the die size of the chip vs other 40nm chips and how many SPs they have. You have the HD4770, which is 137mm^2 and had 640 SPs, while GPU7 is 156mm^2. Which has me struggling to see how 320 SPs is being talked about as the maximum possible here by some people. Does anyone have a reasonable estimate for the space taken up by 32MB of eDRAM on a 40nm chip?

I'm utterly confused. I was saying the NB/SB, eDRAM, and god knows what else have all been moved onto the die. That eats up a pretty sizeable amount of space. You have to split up the 157mm^2 much more carefully because of this.
 
I realise the system has a DSP and an ARM chip, but there is a small third die on the MCM package which could be the ARM chip; either way it's going to take up a tiny amount of space.

Also, I'm not talking about the size of SPs; I'm looking at the die size of the chip vs other 40nm chips and how many SPs they have. You have the HD4770, which is 137mm^2 and had 640 SPs, while GPU7 is 156mm^2. Which has me struggling to see how 320 SPs is being talked about as the maximum possible here by some people. Does anyone have a reasonable estimate for the space taken up by 32MB of eDRAM on a 40nm chip?

The Radeon Mobility 4830 has 640 SPs and that's 136.89mm^2. What happened to considering laptop GPUs?
 

Donnie

Member
I'm utterly confused. I was saying the NB/SB, eDRAM, and god knows what else have all been moved onto the die. That eats up a pretty sizeable amount of space. You have to split up the 157mm^2 much more carefully because of this.

Not sure why you're confused; I already mentioned eDRAM in the post you first replied to, then I mentioned other things that could possibly be on the die. I'm taking into account that the chip is more than just GPU logic; if I weren't, I'd be expecting over 640 SPs (like I said, the 4770 has 640 SPs at 137mm^2 on 40nm). Obviously I'm open to any info people might have on what kind of space 32MB of eDRAM may take up on a 40nm die. I was thinking 50mm^2 at the absolute most (??).
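For a rough answer to the eDRAM question: published 40nm-class eDRAM cells run somewhere around 0.06-0.10 µm², and arrays carry sense amps and redundancy on top of the raw cells. A sketch under those assumptions (none of these inputs are Wii U data):

```python
# Rough eDRAM area estimate: bits * cell area * array overhead.
# Cell sizes and the 1.6x overhead factor are assumptions, not Wii U data.
BITS_PER_MB = 1024 * 1024 * 8

def edram_mm2(megabytes: int, cell_um2: float, overhead: float = 1.6) -> float:
    return megabytes * BITS_PER_MB * cell_um2 * overhead / 1e6  # um^2 -> mm^2

for cell in (0.06, 0.10):
    print(round(edram_mm2(32, cell), 1), "mm^2")
# ~25.8 and ~42.9 -> consistent with "50mm^2 at the absolute most"
```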
 

User Tron

Member
Stupid question: could it be that some SPs are reserved for GPGPU? Like two independent sets: one part being the SIMD unit for the CPU, the other part "classical" shaders.
 

McHuj

Member
Can you explain your reasoning? More shader units can hardly be considered a waste of space IMO.

My reasoning is based on the bandwidth and size of the eDRAM. I don't think there's enough bandwidth to feed that many shaders from memory, nor is the eDRAM big enough to hold all the necessary data for all the shaders. I think they're better off spending the silicon space on additional ROPs, or additional logic similar to what was done in the Xenos daughter die.

Xenos was ~240 shaders, 16 texture units, 8 ROPs, and 10MB of eDRAM. If the Wii U went to 320 shaders, 32 texture units, and 16 ROPs with 32MB of eDRAM, I think it would be a nice upgrade. Anything more would need more eDRAM or faster main memory.
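One way to frame that feeding argument is bytes of main-memory bandwidth per FLOP. The Xenos figures are the stock 360 numbers; the Wii U line assumes a 64-bit DDR3-1600 bus (12.8 GB/s), which is speculation at this point:

```python
# Bytes of main-memory bandwidth available per FLOP. The 12.8 GB/s Wii U
# figure assumes a 64-bit DDR3-1600 bus, which is unconfirmed speculation.
def bytes_per_flop(bandwidth_gbs: float, gflops: float) -> float:
    return bandwidth_gbs / gflops

print(round(bytes_per_flop(22.4, 240), 3))  # Xenos: 0.093 B/FLOP from GDDR3
print(round(bytes_per_flop(12.8, 512), 3))  # 320 SPs @ 800MHz: 0.025
print(round(bytes_per_flop(12.8, 384), 3))  # 480 SPs @ 400MHz: 0.033
```

Donnie's reply below argues the eDRAM render buffer largely takes external bandwidth out of this equation.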
 

Donnie

Member
My reasoning is based on the bandwidth and size of the eDRAM. I don't think there's enough bandwidth to feed that many shaders from memory, nor is the eDRAM big enough to hold all the necessary data for all the shaders. I think they're better off spending the silicon space on additional ROPs, or additional logic similar to what was done in the Xenos daughter die.

Xenos was ~240 shaders, 16 texture units, 8 ROPs, and 10MB of eDRAM. If the Wii U went to 320 shaders, 32 texture units, and 16 ROPs with 32MB of eDRAM, I think it would be a nice upgrade. Anything more would need more eDRAM or faster main memory.

Keeping shader units fed has little to do with external bandwidth if you have a 32MB high-bandwidth, low-latency buffer to render into, because you don't necessarily need to store a large amount of data to keep them fed. Over 320 SPs would be far from a waste IMO.
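To put numbers on "render into": typical 32-bit color + 32-bit depth targets at common resolutions do fit in 32MB (the bytes-per-pixel figures are generic assumptions, not Wii U specifics):

```python
# Does a render target fit in the 32MB buffer? Quick check, assuming
# generic 32-bit color + 32-bit depth per pixel (not Wii U specifics).
def framebuffer_mb(width, height, bytes_color=4, bytes_depth=4, msaa=1):
    return width * height * (bytes_color + bytes_depth) * msaa / (1024 * 1024)

print(round(framebuffer_mb(1280, 720), 1))          # ~7.0MB: plain 720p
print(round(framebuffer_mb(1280, 720, msaa=4), 1))  # ~28.1MB: 720p with 4x MSAA
print(round(framebuffer_mb(1920, 1080), 1))         # ~15.8MB: plain 1080p
```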
 

Donnie

Member
Stupid question: could it be that some SPs are reserved for GPGPU? Like two independent sets: one part being the SIMD unit for the CPU, the other part "classical" shaders.

I don't see any reason to reserve shaders for any particular task. The point of a unified shader architecture is to have one set of shader units that can all perform multiple tasks. That way, when you aren't using many for one task you can use more for another; they can be shared out depending on the situation so none have to be left idle.
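A toy illustration of that scheduling point (unit counts are made up; real schedulers are far more involved):

```python
# Toy comparison: fixed vertex/pixel partitions vs one unified pool.
# With reserved sets, one side can idle while the other is starved.
def fixed_split(vertex_work, pixel_work, vertex_units, pixel_units):
    return min(vertex_work, vertex_units) + min(pixel_work, pixel_units)

def unified(vertex_work, pixel_work, total_units):
    return min(vertex_work + pixel_work, total_units)

# A pixel-heavy frame on 320 units, split 160/160 vs pooled:
print(fixed_split(40, 280, 160, 160))  # 200 busy; 120 vertex units sit idle
print(unified(40, 280, 320))           # 320 busy; the pool flexes per frame
```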
 