SmoothRunningGun
Member
Resolution is alot lower in portable form (I assume) but still it's a pretty weak system.
Nice to see the "only gameplay matters" crowd is still around to downplay issues on low power Nintendo hardware.
lol @ all the "Give me PS4+ performance in a handheld for under $200 or I'm not buying!"
I expected a new home console which could become portable, not a portable console which could become stationary. If I understand the specs correctly it's not even close to PS4 and XB1, and this is 3-4 years after those consoles which are now getting upgraded. Switch comes out the same year as 6TF Xbox Scorpio . Not good, even for conservative Nintendo standards.I'm so OK with this, dunno why people are freaking out. That is a beast of a handheld if it they are getting 5-8 hours of battery playing full Wii U+ games. An iPhone 7 can barely muster 4 hours of Clash of Clans.
It doesn't mean anything though. Nintendo Switch and Nvidia Shield are two very different machines, and I'm pretty sure Switch will be a much better gaming system.
Yet, our point is that marketing it as a home console makes no sense since it'll be a weak home console when it's a powerful handheld.
That point remains to be seen. We only know about clockspeeds. Could be slower indeed. But then again, it's in a far smaller form factor. That's basically what happens when you put a product in a smaller form factor. Shield TV is far bigger than Switch. The same way PS4 Pro is weaker than RX 480 which it's based on.
What I was implying is that every decisions they make is an assumed one, and that they're not just fiddling with some tech they know nothing about, in hope that somehow it works to get them third-party games.
If they made the Switch's specs as they are, they deliberately knew it wasn't going to appeal to huge western third-party. They also know, like everyone else, why the Wii U failed to attract third-party. They knew that, if they just made a huge 400$ box they would have something powerful enough for the third party. Yet they didn't. That's not because they "don't learn".
you can apply that logic to every game, on every console
I haven't had time to read through every response here, so I'm probably repeating what others have already said, but here are my thoughts on the matter, anyway:
CPU Clock
This isn't really surprising, given (as predicted) CPU clocks stay the same between portable and docked mode to make sure games don't suddenly become CPU limited when running in portable mode.
The overall performance really depends on the core configuration. An octo-core A72 setup at 1GHz would be pretty damn close to PS4's 1.6GHZ 8-core Jaguar CPU. I don't necessarily expect that, but a 4x A72 + 4x A53 @ 1GHz should certainly be able to provide "good enough" performance for ports, and wouldn't be at all unreasonable to expect.
Memory Clock
This is also pretty much as expected as 1.6GHz is pretty much the standard LPDDR4 clock speed (which I guess confirms LPDDR4, not that there was a huge amount of doubt). Clocking down in portable mode is sensible, as lower resolution means smaller framebuffers means less bandwidth needed, so they can squeeze out a bit of extra battery life by cutting it down.
Again, though, the clock speed is only one factor. There are two other things that can come into play here. The second factor, obviously enough, is the bus width of the memory. Basically, you're either looking at a 64 bit bus, for 25.6GB/s, or a 128 bit bus, for 51.2GB/s of bandwidth. The third is any embedded memory pools or cache that are on-die with the CPU and GPU. Nintendo hasn't shied away from large embedded memory pools or cache before (just look at the Wii U's CPU, its GPU, the 3DS SoC, the n3DS SoC, etc., etc.), so it would be quite out of character for them to avoid such customisations this time around. Nvidia's GPU architectures from Maxwell onwards use tile-based rendering, which allows them to use on-die caches to reduce main memory bandwidth consumption, which ties in quite well with Nintendo's habits in this regard. Something like a 4MB L3 victim cache (similar to what Apple uses on their A-series SoCs) could potentially reduce bandwidth requirements by quite a lot, although it's extremely difficult to quantify the precise benefit.
GPU Clock
This is where things get a lot more interesting. To start off, the relationship between the two clock speeds is pretty much as expected. With a target of 1080p in docked mode and 720p in undocked mode, there's a 2.25x difference in pixels to be rendered, so a 2.5x difference in clock speeds would give developers a roughly equivalent amount of GPU performance per pixel in both modes.
Once more, though, and perhaps most importantly in this case, any interpretation of the clock speeds themselves is entirely dependent on the configuration of the GPU, namely the number of SMs (also ROPs, front-end blocks, etc, but we'll assume that they're kept in sensible ratios).
Case 1: 2 SMs - Docked: 384 GF FP32 / 768 GF FP16 - Portable: 153.6 GF FP32 / 307.2 GF FP16
I had generally been assuming that 2 SMs was the most likely configuration (as, I believe, had most people), simply on the basis of allowing for the smallest possible SoC which could meet Nintendo's performance goals. I'm not quite so sure now, for a number of reasons.
Firstly, if Nintendo were to use these clocks with a 2 SM configuration (assuming 20nm), then why bother with active cooling? The Pixel C runs a passively cooled TX1, and although people will be quick to point out that Pixel C throttles its GPU clocks while running for a prolonged time due to heat output, there are a few things to be aware of with Pixel C. Firstly, there's a quad-core A57 CPU cluster at 1.9GHz running alongside it, which on 20nm will consume a whopping 7.39W when fully clocked. Switch's CPU might be expected to only consume around 1.5W, by comparison. Secondly, although I haven't been able to find any decent analysis of Pixel C's GPU throttling, the mentions of it I have found indicate that, although it does throttle, the drop in performance is relatively small, and as it's clocked about 100MHz above Switch to begin with it may only be throttling down to a 750MHz clock or so even under prolonged workloads. There is of course the fact that Pixel C has an aluminium body to allow for easier thermal dissipation, but it likely would have been cheaper (and mechanically much simpler) for Nintendo to adopt the same approach, rather than active cooling.
Alternatively, we can think of it a different way. If Switch has active cooling, then why clock so low? Again assuming 20nm, we know that a full 1GHz clock shouldn't be a problem for active cooling, even with a very small quiet fan, given the Shield TV (which, again, uses a much more power-hungry CPU than Switch). Furthermore, if they wanted a 2.5x ratio between the two clock speeds, that would give a 400MHz clock in portable mode. We know that the TX1, with 2 SMs on 20nm, consumes 1.51W (GPU only) when clocked at about 500MHz. Even assuming that that's a favourable demo for the TX1, at 20% lower clock speed I would be surprised if a 400MHz 2 SM GPU would consume any more than 1.5W. That's obviously well within the bounds for passive cooling, but even being very conservative with battery consumption it shouldn't be an issue. The savings from going from 400MHz to 300MHz would perhaps only increase battery life by about 5-10% tops, which makes it puzzling why they'd turn down the extra performance.
Finally, the recently published Switch patent application actually explicitly talks about running the fan at a lower RPM while in portable mode, and doesn't even mention the possibility of turning it off while running in portable mode. A 2 SM 20nm Maxwell GPU at ~300MHz shouldn't require a fan at all, and although it's possible that they've changed their mind since filing the patent in June, it begs the question of why they would even consider running the fan in portable mode if their target performance was anywhere near this.
Case 2: 3 SMs - Docked: 576 GF FP32 / 1,152 GF FP16 - Portable: 230.4 GF FP32 / 460.8 GF FP16
This is a bit closer to the performance level we've been led to expect, and it does make a little bit of sense from the perspective of giving a little bit over TX1 performance at lower power consumption. (It also matches reports of overclocked TX1s in early dev kits, as you'd need to clock a bit over the standard 1GHz to reach docked performance here.) Active cooling while docked makes sense for a 3 SM GPU at 768MHz, although wouldn't be needed in portable mode. It still leaves the question of why not use 1GHz/400MHz clocks, as even with 3 SMs they should be able to get by with passive cooling at 400MHz, and battery consumption shouldn't be that much of an issue.
Case 3: 4 SMs - Docked: 768 GF FP32 / 1,536 GF FP16 - Portable: 307.2 GF FP32 / 614.4 GF FP16
This would be on the upper limit of what's been expected, performance wise, and the clock speeds start to make more sense at this point, as portable power consumption for the GPU would be around the 2W mark, so further clock increases may start to effect battery life a bit too much (not that 400-500MHz would be impossible from that point of view, though). Active cooling would be necessary in docked mode, but still shouldn't be needed in portable mode (except perhaps if they go with a beefier CPU config than expected).
Case 4: More than 4 SMs
I'd consider this pretty unlikely, but just from the point of view of "what would you have to do to actually need active cooling in portable mode at these clocks", something like 6 SMs would probably do it (1.15 TF FP32/2.3 TF FP16 docked, 460 GF FP32/920 GF FP16 portable), but I wouldn't count on that. For one, it's well beyond the performance levels that reliable-so-far journalists have told us to expect, but it would also require a much larger die than would be typical for a portable device like this (still much smaller than PS4/XBO SoCs, but that's a very different situation).
TLR
Each of these numbers are only a single variable in the equation, and we need to know things like CPU configuration, memory bus width, embedded memory pools, number of GPU SMs, etc. to actually fill out the rest of those equations to get the relevant info. Even on the worst end of the spectrum, we're still getting by far the most ambitious portable that Nintendo's ever released, which also doubles as a home console that's noticeably higher performing than Wii U, which is fine by me.
Yeah we're so close to the Switch presentation, there's no point taking anything as fact for a few weeks, its all rumors.
They're targeting it due to vaguely similar demographics, that don't really exist on Nintendo ecosystems. Unless there's a noticeably huge shift from Vita->Switch, those developers will likely continue hovering around Sony consoles.
I don't think you need a doctorate in Nintendo decisions to know that Nintendo has done this before and the outcome was extremely undesirable. The power capabilities of the Switch could result in another round of headaches and disinterest from developers, publishers, and consumers. I don't know about you, but I think Nintendo should avoid creating a sequel to the Wii U.
We're still dealing in hypotheticals and rumors, so everything we know could change at the January event. For a portable, the Switch seems like it will beat the Vita in performance but fall short of modern smartphones. For a console, it doesn't seem like it will be competitive against any of the current or future hardware releases. This could be a good thing, but looking at what happened with the Wii U, you can begin to understand why a lot of people believe this could be a bad thing. A very bad thing.
I haven't had time to read through every response here, so I'm probably repeating what others have already said, but here are my thoughts on the matter, anyway:
CPU Clock
This isn't really surprising, given (as predicted) CPU clocks stay the same between portable and docked mode to make sure games don't suddenly become CPU limited when running in portable mode.
The overall performance really depends on the core configuration. An octo-core A72 setup at 1GHz would be pretty damn close to PS4's 1.6GHZ 8-core Jaguar CPU. I don't necessarily expect that, but a 4x A72 + 4x A53 @ 1GHz should certainly be able to provide "good enough" performance for ports, and wouldn't be at all unreasonable to expect.
Memory Clock
This is also pretty much as expected as 1.6GHz is pretty much the standard LPDDR4 clock speed (which I guess confirms LPDDR4, not that there was a huge amount of doubt). Clocking down in portable mode is sensible, as lower resolution means smaller framebuffers means less bandwidth needed, so they can squeeze out a bit of extra battery life by cutting it down.
Again, though, the clock speed is only one factor. There are two other things that can come into play here. The second factor, obviously enough, is the bus width of the memory. Basically, you're either looking at a 64 bit bus, for 25.6GB/s, or a 128 bit bus, for 51.2GB/s of bandwidth. The third is any embedded memory pools or cache that are on-die with the CPU and GPU. Nintendo hasn't shied away from large embedded memory pools or cache before (just look at the Wii U's CPU, its GPU, the 3DS SoC, the n3DS SoC, etc., etc.), so it would be quite out of character for them to avoid such customisations this time around. Nvidia's GPU architectures from Maxwell onwards use tile-based rendering, which allows them to use on-die caches to reduce main memory bandwidth consumption, which ties in quite well with Nintendo's habits in this regard. Something like a 4MB L3 victim cache (similar to what Apple uses on their A-series SoCs) could potentially reduce bandwidth requirements by quite a lot, although it's extremely difficult to quantify the precise benefit.
GPU Clock
This is where things get a lot more interesting. To start off, the relationship between the two clock speeds is pretty much as expected. With a target of 1080p in docked mode and 720p in undocked mode, there's a 2.25x difference in pixels to be rendered, so a 2.5x difference in clock speeds would give developers a roughly equivalent amount of GPU performance per pixel in both modes.
Once more, though, and perhaps most importantly in this case, any interpretation of the clock speeds themselves is entirely dependent on the configuration of the GPU, namely the number of SMs (also ROPs, front-end blocks, etc, but we'll assume that they're kept in sensible ratios).
Case 1: 2 SMs - Docked: 384 GF FP32 / 768 GF FP16 - Portable: 153.6 GF FP32 / 307.2 GF FP16
I had generally been assuming that 2 SMs was the most likely configuration (as, I believe, had most people), simply on the basis of allowing for the smallest possible SoC which could meet Nintendo's performance goals. I'm not quite so sure now, for a number of reasons.
Firstly, if Nintendo were to use these clocks with a 2 SM configuration (assuming 20nm), then why bother with active cooling? The Pixel C runs a passively cooled TX1, and although people will be quick to point out that Pixel C throttles its GPU clocks while running for a prolonged time due to heat output, there are a few things to be aware of with Pixel C. Firstly, there's a quad-core A57 CPU cluster at 1.9GHz running alongside it, which on 20nm will consume a whopping 7.39W when fully clocked. Switch's CPU might be expected to only consume around 1.5W, by comparison. Secondly, although I haven't been able to find any decent analysis of Pixel C's GPU throttling, the mentions of it I have found indicate that, although it does throttle, the drop in performance is relatively small, and as it's clocked about 100MHz above Switch to begin with it may only be throttling down to a 750MHz clock or so even under prolonged workloads. There is of course the fact that Pixel C has an aluminium body to allow for easier thermal dissipation, but it likely would have been cheaper (and mechanically much simpler) for Nintendo to adopt the same approach, rather than active cooling.
Alternatively, we can think of it a different way. If Switch has active cooling, then why clock so low? Again assuming 20nm, we know that a full 1GHz clock shouldn't be a problem for active cooling, even with a very small quiet fan, given the Shield TV (which, again, uses a much more power-hungry CPU than Switch). Furthermore, if they wanted a 2.5x ratio between the two clock speeds, that would give a 400MHz clock in portable mode. We know that the TX1, with 2 SMs on 20nm, consumes 1.51W (GPU only) when clocked at about 500MHz. Even assuming that that's a favourable demo for the TX1, at 20% lower clock speed I would be surprised if a 400MHz 2 SM GPU would consume any more than 1.5W. That's obviously well within the bounds for passive cooling, but even being very conservative with battery consumption it shouldn't be an issue. The savings from going from 400MHz to 300MHz would perhaps only increase battery life by about 5-10% tops, which makes it puzzling why they'd turn down the extra performance.
Finally, the recently published Switch patent application actually explicitly talks about running the fan at a lower RPM while in portable mode, and doesn't even mention the possibility of turning it off while running in portable mode. A 2 SM 20nm Maxwell GPU at ~300MHz shouldn't require a fan at all, and although it's possible that they've changed their mind since filing the patent in June, it begs the question of why they would even consider running the fan in portable mode if their target performance was anywhere near this.
Case 2: 3 SMs - Docked: 576 GF FP32 / 1,152 GF FP16 - Portable: 230.4 GF FP32 / 460.8 GF FP16
This is a bit closer to the performance level we've been led to expect, and it does make a little bit of sense from the perspective of giving a little bit over TX1 performance at lower power consumption. (It also matches reports of overclocked TX1s in early dev kits, as you'd need to clock a bit over the standard 1GHz to reach docked performance here.) Active cooling while docked makes sense for a 3 SM GPU at 768MHz, although wouldn't be needed in portable mode. It still leaves the question of why not use 1GHz/400MHz clocks, as even with 3 SMs they should be able to get by with passive cooling at 400MHz, and battery consumption shouldn't be that much of an issue.
Case 3: 4 SMs - Docked: 768 GF FP32 / 1,536 GF FP16 - Portable: 307.2 GF FP32 / 614.4 GF FP16
This would be on the upper limit of what's been expected, performance wise, and the clock speeds start to make more sense at this point, as portable power consumption for the GPU would be around the 2W mark, so further clock increases may start to effect battery life a bit too much (not that 400-500MHz would be impossible from that point of view, though). Active cooling would be necessary in docked mode, but still shouldn't be needed in portable mode (except perhaps if they go with a beefier CPU config than expected).
Case 4: More than 4 SMs
I'd consider this pretty unlikely, but just from the point of view of "what would you have to do to actually need active cooling in portable mode at these clocks", something like 6 SMs would probably do it (1.15 TF FP32/2.3 TF FP16 docked, 460 GF FP32/920 GF FP16 portable), but I wouldn't count on that. For one, it's well beyond the performance levels that reliable-so-far journalists have told us to expect, but it would also require a much larger die than would be typical for a portable device like this (still much smaller than PS4/XBO SoCs, but that's a very different situation).
TLR
Each of these numbers are only a single variable in the equation, and we need to know things like CPU configuration, memory bus width, embedded memory pools, number of GPU SMs, etc. to actually fill out the rest of those equations to get the relevant info. Even on the worst end of the spectrum, we're still getting by far the most ambitious portable that Nintendo's ever released, which also doubles as a home console that's noticeably higher performing than Wii U, which is fine by me.
Power isn't nearly as much of a reason as to what will hinge on whether or not third parties support this thing as "will people who own it actually buy third party games on the thing", is
There are still a lot of unknowns about the specs that we should hold of on Nintendooming about.. But still.
If games like Skyrim or Dark Souls Trilogy or whatever third party support makes it on switch sells well.. We'll get more stuff like that. If not, the third party support will dry up like it did almost immediately on the WiiU.
Nintendo could have made a system with Scorpio power but it wouldn't mean shit if people who owned it didn't buy the third party games that release on the thing. I'm hoping the portability aspect of the system actually draws interest from more people than Nintendo's previous "hooks", like waggle control on the Wii and Gamepad dual screen whatever on the WiiU,
Both would need to be the same thing! Or did you expected 1/2 hour of battery time or a handheld as big as the PS4?I expected a new home console which could become portable, not a portable console which could become stationary. If I understand the specs correctly it's not even close to PS4 and XB1, and this is 3-4 years after those consoles which are now getting upgraded. Switch comes out the same year as 6TF Xbox Scorpio . Not good, even for conservative Nintendo standards.
Very much doubt this bit. Those Japanese Vita devs have been primarily targeting PS4 for the last year or two and Switch is probably coming in too low now.
I agree. While I trust Digital Foundry, I'm still going to wait it out for the January reveal.
Nintendo *might* be releasing false info to start weeding out "leaks" in their mist.
I trust Nintendo learned their lessons with Wii U. Bring on January!
I am curious why did 3rd parties flock to the Wii tho. There were many Wii exclusive 3rd party games.
Were first few years sales that good? Were they on board from the beginning?
Using wikipedia right now, so many were on board the first months before they even knew how good sales were gonna be.
So what made them flock before they knew it was gonna be a hit?
And looking now, the Wii had alot of 3rd party support for multi platform games. I already knew this but I didnt realize it was this good...some games I didnt even realize came out for it.
And it wasnt all shovelware.
I guess it was Drake that gave the 5-8 hour number, so who knows now. But the slower speeds seem to imply they are gunning for battery life over performance.
lol @ all the "Give me PS4+ performance in a handheld for under $200 or I'm not buying!"
I am curious why did 3rd parties flock to the Wii tho. There were many Wii exclusive 3rd party games.
Were first few years sales that good? Were they on board from the beginning?
Using wikipedia right now, so many were on board the first months before they even knew how good sales were gonna be.
So what made them flock before they knew it was gonna be a hit?
And looking now, the Wii had alot of 3rd party support for multi platform games. I already knew this but I didnt realize it was this good...some games I didnt even realize came out for it.
And it wasnt all shovelware.
Excellent analysis as always, and I agree with that conclusion regarding the fan, though I'm starting to expect that I'll be disappointed again regardless of what I'm hoping for.
But another thing to consider is LKD's report that there is an additional fan in the dock. I know the patent didn't support that report but the patent was filed in June, and adding a fan to the dock is certainly a possible change from the patented specification. 2 fans would be a ridiculous amount of overkill if the clock rates listed are being ascribed to TX1 hardware.
Also, consider the DF article did mention that it's likely that some of the customizations to the SoC were from the Pascal architecture, so it's still certainly possible this will be on a 16nm process. Unlikely I'd say at this point, but possible.
When your only choices for a new portable game console is the Switch, where exactly do you think those Vita fans are going to go? The only other choice is to leave portables entirely & become a mobile fan (which many Japanese gamers have been doing) or become a home-only console fan (which is becoming less & less popular in Japan).
Instead of the Vita library, because some have been focusing on the PS4 (though not all would likely be able to do that) I would replace that with possibly premium mobile titles that exist. Like the ones Square Enix have been putting out that aren't stuff that could already be found on the virtual console. If the Switch can run mobile content just fine why not push out a Switch version to have another revenue source?
Nice to see the "only gameplay matters" crowd is still around to downplay issues on low power Nintendo hardware.
Not terribly surprising. That Tegra SoC was designed for a desktop application, not a handheld device. Had to cripple it to get decent battery life and low enough TDP for the form factor.
Thank you for writing this. Very informative.
So GAF, what is the most likely: 2 SMs or 3 SMs ?
I'm perfectly fine with this too... it's all I really wanted or expected.TLR
Each of these numbers are only a single variable in the equation, and we need to know things like CPU configuration, memory bus width, embedded memory pools, number of GPU SMs, etc. to actually fill out the rest of those equations to get the relevant info. Even on the worst end of the spectrum, we're still getting by far the most ambitious portable that Nintendo's ever released, which also doubles as a home console that's noticeably higher performing than Wii U, which is fine by me.
In any case, i feel the safest move for players is to wait to see if the console is selling and is getting games, and then take a decision. Wait and see if they bring the games you want, if third party keep supporting the console,etc.
In my opinion it depends on the fab node. If it's 20nm I don't see more than 2 SM. If it's 16nm I think 3 SM would be more probably simply because the fan would be totally useless for 2 SM.
Sure, but most of the time you can at least rationalize each console being the best it could or should have been at the time of release. Replaying old games, it's still satisfying to reflect on how that was pushing the boundaries at release.
Antiquated now, but stunning in 1998.
This isn't even out yet and looks acceptable by 2017 standards. If this is a Wii U screenshot and Switch looks outstandingly better then feel free to post a screenshot and I'll happily concede that I'm wrong.
Nintendo is the only one who consistently underperforms in that category. This isn't even going into the benefits of RAM and CPU.
Edit - also kudos to Thraktor for the great write-up
I haven't had time to read through every response here, so I'm probably repeating what others have already said, but here are my thoughts on the matter, anyway:
CPU Clock
This isn't really surprising, given (as predicted) CPU clocks stay the same between portable and docked mode to make sure games don't suddenly become CPU limited when running in portable mode.
The overall performance really depends on the core configuration. An octo-core A72 setup at 1GHz would be pretty damn close to PS4's 1.6GHZ 8-core Jaguar CPU. I don't necessarily expect that, but a 4x A72 + 4x A53 @ 1GHz should certainly be able to provide "good enough" performance for ports, and wouldn't be at all unreasonable to expect.
Memory Clock
This is also pretty much as expected as 1.6GHz is pretty much the standard LPDDR4 clock speed (which I guess confirms LPDDR4, not that there was a huge amount of doubt). Clocking down in portable mode is sensible, as lower resolution means smaller framebuffers means less bandwidth needed, so they can squeeze out a bit of extra battery life by cutting it down.
Again, though, the clock speed is only one factor. There are two other things that can come into play here. The second factor, obviously enough, is the bus width of the memory. Basically, you're either looking at a 64 bit bus, for 25.6GB/s, or a 128 bit bus, for 51.2GB/s of bandwidth. The third is any embedded memory pools or cache that are on-die with the CPU and GPU. Nintendo hasn't shied away from large embedded memory pools or cache before (just look at the Wii U's CPU, its GPU, the 3DS SoC, the n3DS SoC, etc., etc.), so it would be quite out of character for them to avoid such customisations this time around. Nvidia's GPU architectures from Maxwell onwards use tile-based rendering, which allows them to use on-die caches to reduce main memory bandwidth consumption, which ties in quite well with Nintendo's habits in this regard. Something like a 4MB L3 victim cache (similar to what Apple uses on their A-series SoCs) could potentially reduce bandwidth requirements by quite a lot, although it's extremely difficult to quantify the precise benefit.
GPU Clock
This is where things get a lot more interesting. To start off, the relationship between the two clock speeds is pretty much as expected. With a target of 1080p in docked mode and 720p in undocked mode, there's a 2.25x difference in pixels to be rendered, so a 2.5x difference in clock speeds would give developers a roughly equivalent amount of GPU performance per pixel in both modes.
Once more, though, and perhaps most importantly in this case, any interpretation of the clock speeds themselves is entirely dependent on the configuration of the GPU, namely the number of SMs (also ROPs, front-end blocks, etc, but we'll assume that they're kept in sensible ratios).
Case 1: 2 SMs - Docked: 384 GF FP32 / 768 GF FP16 - Portable: 153.6 GF FP32 / 307.2 GF FP16
I had generally been assuming that 2 SMs was the most likely configuration (as, I believe, had most people), simply on the basis of allowing for the smallest possible SoC which could meet Nintendo's performance goals. I'm not quite so sure now, for a number of reasons.
Firstly, if Nintendo were to use these clocks with a 2 SM configuration (assuming 20nm), then why bother with active cooling? The Pixel C runs a passively cooled TX1, and although people will be quick to point out that Pixel C throttles its GPU clocks while running for a prolonged time due to heat output, there are a few things to be aware of with Pixel C. Firstly, there's a quad-core A57 CPU cluster at 1.9GHz running alongside it, which on 20nm will consume a whopping 7.39W when fully clocked. Switch's CPU might be expected to only consume around 1.5W, by comparison. Secondly, although I haven't been able to find any decent analysis of Pixel C's GPU throttling, the mentions of it I have found indicate that, although it does throttle, the drop in performance is relatively small, and as it's clocked about 100MHz above Switch to begin with it may only be throttling down to a 750MHz clock or so even under prolonged workloads. There is of course the fact that Pixel C has an aluminium body to allow for easier thermal dissipation, but it likely would have been cheaper (and mechanically much simpler) for Nintendo to adopt the same approach, rather than active cooling.
Alternatively, we can think of it a different way. If Switch has active cooling, then why clock so low? Again assuming 20nm, we know that a full 1GHz clock shouldn't be a problem for active cooling, even with a very small quiet fan, given the Shield TV (which, again, uses a much more power-hungry CPU than Switch). Furthermore, if they wanted a 2.5x ratio between the two clock speeds, that would give a 400MHz clock in portable mode. We know that the TX1, with 2 SMs on 20nm, consumes 1.51W (GPU only) when clocked at about 500MHz. Even assuming that that's a favourable demo for the TX1, at 20% lower clock speed I would be surprised if a 400MHz 2 SM GPU would consume any more than 1.5W. That's obviously well within the bounds for passive cooling, but even being very conservative with battery consumption it shouldn't be an issue. The savings from going from 400MHz to 300MHz would perhaps only increase battery life by about 5-10% tops, which makes it puzzling why they'd turn down the extra performance.
Finally, the recently published Switch patent application actually explicitly talks about running the fan at a lower RPM while in portable mode, and doesn't even mention the possibility of turning it off while running in portable mode. A 2 SM 20nm Maxwell GPU at ~300MHz shouldn't require a fan at all, and although it's possible that they've changed their mind since filing the patent in June, it begs the question of why they would even consider running the fan in portable mode if their target performance was anywhere near this.
Case 2: 3 SMs - Docked: 576 GF FP32 / 1,152 GF FP16 - Portable: 230.4 GF FP32 / 460.8 GF FP16
This is a bit closer to the performance level we've been led to expect, and it does make a little bit of sense from the perspective of giving a little bit over TX1 performance at lower power consumption. (It also matches reports of overclocked TX1s in early dev kits, as you'd need to clock a bit over the standard 1GHz to reach docked performance here.) Active cooling while docked makes sense for a 3 SM GPU at 768MHz, although wouldn't be needed in portable mode. It still leaves the question of why not use 1GHz/400MHz clocks, as even with 3 SMs they should be able to get by with passive cooling at 400MHz, and battery consumption shouldn't be that much of an issue.
Case 3: 4 SMs - Docked: 768 GF FP32 / 1,536 GF FP16 - Portable: 307.2 GF FP32 / 614.4 GF FP16
This would be on the upper limit of what's been expected, performance wise, and the clock speeds start to make more sense at this point, as portable power consumption for the GPU would be around the 2W mark, so further clock increases may start to effect battery life a bit too much (not that 400-500MHz would be impossible from that point of view, though). Active cooling would be necessary in docked mode, but still shouldn't be needed in portable mode (except perhaps if they go with a beefier CPU config than expected).
Case 4: More than 4 SMs
I'd consider this pretty unlikely, but just from the point of view of "what would you have to do to actually need active cooling in portable mode at these clocks", something like 6 SMs would probably do it (1.15 TF FP32/2.3 TF FP16 docked, 460 GF FP32/920 GF FP16 portable), but I wouldn't count on that. For one, it's well beyond the performance levels that reliable-so-far journalists have told us to expect, but it would also require a much larger die than would be typical for a portable device like this (still much smaller than PS4/XBO SoCs, but that's a very different situation).
TLR
Each of these numbers are only a single variable in the equation, and we need to know things like CPU configuration, memory bus width, embedded memory pools, number of GPU SMs, etc. to actually fill out the rest of those equations to get the relevant info. Even on the worst end of the spectrum, we're still getting by far the most ambitious portable that Nintendo's ever released, which also doubles as a home console that's noticeably higher performing than Wii U, which is fine by me.
Weaker than Shield TV?
Nvidia are experts selling crap to console companies (remember PS3 GPU...)
I don't know a lot about tech, but this seems a lot worse than expected.
I guess I don't understand why companies like Take 2, Bethesda, and FROM software have said good things about the Switch? If everyone on here is having meltdowns, why is developer reception so much stronger than it was with the WiiU?
I suppose the good news here is that lower power draw means better battery and cheaper price. I can't help but feel a bit disappointed that base PS4/X1 games will need downgrading for Switch.
Like I said in the beginning, making this a hybrid just brings down the home console aspect. Wish they would have made two separate devices for home and handheld with a shared library.
Still really excited for the Switch. Just wish they had went in a slightly different direction.
I feel like there's still a chance for a home only model down the line, or at the very least a more powerful Switch.
maybe you should consider things out of just graphics, have you seen the physics and the size of breath of the wild? if thats not pushing bundaries i dont know what is
There's no fan in the dock.
I haven't had time to read through every response here, so I'm probably repeating what others have already said, but here are my thoughts on the matter, anyway:
CPU Clock
This isn't really surprising, given (as predicted) CPU clocks stay the same between portable and docked mode to make sure games don't suddenly become CPU limited when running in portable mode.
The overall performance really depends on the core configuration. An octo-core A72 setup at 1GHz would be pretty damn close to PS4's 1.6GHZ 8-core Jaguar CPU. I don't necessarily expect that, but a 4x A72 + 4x A53 @ 1GHz should certainly be able to provide "good enough" performance for ports, and wouldn't be at all unreasonable to expect.
Memory Clock
This is also pretty much as expected as 1.6GHz is pretty much the standard LPDDR4 clock speed (which I guess confirms LPDDR4, not that there was a huge amount of doubt). Clocking down in portable mode is sensible, as lower resolution means smaller framebuffers means less bandwidth needed, so they can squeeze out a bit of extra battery life by cutting it down.
Again, though, the clock speed is only one factor. There are two other things that can come into play here. The second factor, obviously enough, is the bus width of the memory. Basically, you're either looking at a 64 bit bus, for 25.6GB/s, or a 128 bit bus, for 51.2GB/s of bandwidth. The third is any embedded memory pools or cache that are on-die with the CPU and GPU. Nintendo hasn't shied away from large embedded memory pools or cache before (just look at the Wii U's CPU, its GPU, the 3DS SoC, the n3DS SoC, etc., etc.), so it would be quite out of character for them to avoid such customisations this time around. Nvidia's GPU architectures from Maxwell onwards use tile-based rendering, which allows them to use on-die caches to reduce main memory bandwidth consumption, which ties in quite well with Nintendo's habits in this regard. Something like a 4MB L3 victim cache (similar to what Apple uses on their A-series SoCs) could potentially reduce bandwidth requirements by quite a lot, although it's extremely difficult to quantify the precise benefit.
GPU Clock
This is where things get a lot more interesting. To start off, the relationship between the two clock speeds is pretty much as expected. With a target of 1080p in docked mode and 720p in undocked mode, there's a 2.25x difference in pixels to be rendered, so a 2.5x difference in clock speeds would give developers a roughly equivalent amount of GPU performance per pixel in both modes.
Once more, though, and perhaps most importantly in this case, any interpretation of the clock speeds themselves is entirely dependent on the configuration of the GPU, namely the number of SMs (also ROPs, front-end blocks, etc, but we'll assume that they're kept in sensible ratios).
Case 1: 2 SMs - Docked: 384 GF FP32 / 768 GF FP16 - Portable: 153.6 GF FP32 / 307.2 GF FP16
I had generally been assuming that 2 SMs was the most likely configuration (as, I believe, had most people), simply on the basis of allowing for the smallest possible SoC which could meet Nintendo's performance goals. I'm not quite so sure now, for a number of reasons.
Firstly, if Nintendo were to use these clocks with a 2 SM configuration (assuming 20nm), then why bother with active cooling? The Pixel C runs a passively cooled TX1, and although people will be quick to point out that Pixel C throttles its GPU clocks while running for a prolonged time due to heat output, there are a few things to be aware of with Pixel C. Firstly, there's a quad-core A57 CPU cluster at 1.9GHz running alongside it, which on 20nm will consume a whopping 7.39W when fully clocked. Switch's CPU might be expected to only consume around 1.5W, by comparison. Secondly, although I haven't been able to find any decent analysis of Pixel C's GPU throttling, the mentions of it I have found indicate that, although it does throttle, the drop in performance is relatively small, and as it's clocked about 100MHz above Switch to begin with it may only be throttling down to a 750MHz clock or so even under prolonged workloads. There is of course the fact that Pixel C has an aluminium body to allow for easier thermal dissipation, but it likely would have been cheaper (and mechanically much simpler) for Nintendo to adopt the same approach, rather than active cooling.
Alternatively, we can think of it a different way. If Switch has active cooling, then why clock so low? Again assuming 20nm, we know that a full 1GHz clock shouldn't be a problem for active cooling, even with a very small quiet fan, given the Shield TV (which, again, uses a much more power-hungry CPU than Switch). Furthermore, if they wanted a 2.5x ratio between the two clock speeds, that would give a 400MHz clock in portable mode. We know that the TX1, with 2 SMs on 20nm, consumes 1.51W (GPU only) when clocked at about 500MHz. Even assuming that that's a favourable demo for the TX1, at 20% lower clock speed I would be surprised if a 400MHz 2 SM GPU would consume any more than 1.5W. That's obviously well within the bounds for passive cooling, but even being very conservative with battery consumption it shouldn't be an issue. The savings from going from 400MHz to 300MHz would perhaps only increase battery life by about 5-10% tops, which makes it puzzling why they'd turn down the extra performance.
Finally, the recently published Switch patent application actually explicitly talks about running the fan at a lower RPM while in portable mode, and doesn't even mention the possibility of turning it off while running in portable mode. A 2 SM 20nm Maxwell GPU at ~300MHz shouldn't require a fan at all, and although it's possible that they've changed their mind since filing the patent in June, it begs the question of why they would even consider running the fan in portable mode if their target performance was anywhere near this.
Case 2: 3 SMs - Docked: 576 GF FP32 / 1,152 GF FP16 - Portable: 230.4 GF FP32 / 460.8 GF FP16
This is a bit closer to the performance level we've been led to expect, and it does make a little bit of sense from the perspective of giving a little bit over TX1 performance at lower power consumption. (It also matches reports of overclocked TX1s in early dev kits, as you'd need to clock a bit over the standard 1GHz to reach docked performance here.) Active cooling while docked makes sense for a 3 SM GPU at 768MHz, although wouldn't be needed in portable mode. It still leaves the question of why not use 1GHz/400MHz clocks, as even with 3 SMs they should be able to get by with passive cooling at 400MHz, and battery consumption shouldn't be that much of an issue.
Case 3: 4 SMs - Docked: 768 GF FP32 / 1,536 GF FP16 - Portable: 307.2 GF FP32 / 614.4 GF FP16
This would be on the upper limit of what's been expected, performance wise, and the clock speeds start to make more sense at this point, as portable power consumption for the GPU would be around the 2W mark, so further clock increases may start to effect battery life a bit too much (not that 400-500MHz would be impossible from that point of view, though). Active cooling would be necessary in docked mode, but still shouldn't be needed in portable mode (except perhaps if they go with a beefier CPU config than expected).
Case 4: More than 4 SMs
I'd consider this pretty unlikely, but just from the point of view of "what would you have to do to actually need active cooling in portable mode at these clocks", something like 6 SMs would probably do it (1.15 TF FP32/2.3 TF FP16 docked, 460 GF FP32/920 GF FP16 portable), but I wouldn't count on that. For one, it's well beyond the performance levels that reliable-so-far journalists have told us to expect, but it would also require a much larger die than would be typical for a portable device like this (still much smaller than PS4/XBO SoCs, but that's a very different situation).
TLR
Each of these numbers are only a single variable in the equation, and we need to know things like CPU configuration, memory bus width, embedded memory pools, number of GPU SMs, etc. to actually fill out the rest of those equations to get the relevant info. Even on the worst end of the spectrum, we're still getting by far the most ambitious portable that Nintendo's ever released, which also doubles as a home console that's noticeably higher performing than Wii U, which is fine by me.
How could Laura Emily be wrong tho!?!?Some people actually expected a powerful Nintendo hardware. Lol
I means Nintendo cares about battery life enough to cut performance by 60%...
So probably.
Without knowing the CUDA core count we can't say for sure.
Lowest estimates put the raw power of it slightly above the Iphone 6S GPU when in handheld mode. Keep in mind this would be unhampered by IOS and typical phone game creation constraints.
Lowest estimates put it about 6x the Vita in handheld mode
So >2x WiiU when docked right? Wasnt expecting much more to be honest.
Edit: if it is 200 - 250 bucks there is no excuse not to get it.
Thanks for the write-up. Let's see how many people read it.
I predict 3.
lol @ all the "Give me PS4+ performance in a handheld for under $200 or I'm not buying!"
Seriously.