
Inside the Scorpio Engine: the processor architecture deep dive

Space_nut

Member
http://www.eurogamer.net/articles/digitalfoundry-2017-the-scorpio-engine-in-depth

CPU Customizations
"Typically for CPU, the top two items are frequency and memory latency. If the CPU has data, the faster it can process it, the quicker the result, but it also means that if it doesn't have the data, it sits there idle, so latency is a big component. On frequency, we pushed it up to 2.3GHz," explains Nick Baker. "On the latency, a couple of the areas we tackled, one was all the queues coming back from the memory interface, we sped those up as well. Specifically, within the core, because we're running a virtualised OS environment, we wanted to optimise how memory translation operations happen so there are some key changes inside the core to speed those things up. The end result is that not only does the CPU run faster, it also runs more efficiently, meaning more power for you at the end."

GPU Customizations

According to Goossen, some performance optimisations from the upcoming AMD Vega architecture factor into the Scorpio Engine's design, but other features that made it into PS4 Pro - for example, double-rate FP16 processing - do not. However, customisation was extensive elsewhere. Microsoft's GPU command processor implementation of DX12 has provided big wins for Xbox One developers, and it's set for expansion in Scorpio.

"We did multiple PIX captures from every single game and ran them on the emulator," Andrew Goossen, Technical Fellow, Graphics, tells us - a process that proved invaluable for validating Scorpio's back-compat capabilities. "We did over 30,000 emulator runs, which is a big contributor to Nick's total cycle count because we had to make sure that we were going to land with that 100 per cent compatibility [with Xbox One]."

But to what extent do those customisations elevate Scorpio beyond a PC equipped with a notional, baseline Radeon equivalent to Scorpio's GPU - no customisation but 'the same teraflops'. After the presentation, that's exactly what I asked Microsoft in the first of a couple of follow-up rounds of questions conducted over email.

"Our performance analysis and modelling was so core to the entire design process of optimisation and adjustments that I don't have a specific example to call out," says Andrew Goossen. "We put every change we considered through the model. But in terms of 'more from your teraflops', I will point out that Scorpio has significant performance benefits relative to PC:

"Microsoft has made continual improvements to the shader compiler. We see significant performance wins for Xbox game content relative to compiling the same shaders on PC. [Secondly], 'to the metal' API and shader extension support allows developer to optimise in ways that simply can't be done on PC cards. [Finally], PIX provides low level analysis and insight that, in conjunction with 'to the metal' support, allows developers to make the most of the console GPU. These technologies are all already mature and familiar to developers, so Scorpio games will benefit from the get go."

4k asset me if old
 

gamz

Member
I posted this in the other thread:

"We did multiple PIX captures from every single game and ran them on the emulator," Andrew Goossen, Technical Fellow, Graphics, tells us - a process that proved invaluable for validating Scorpio's back-compat capabilities. "We did over 30,000 emulator runs, which is a big contributor to Nick's total cycle count because we had to make sure that we were going to land with that 100 per cent compatibility [with Xbox One]."

That's some exhaustive work!
 

geordiemp

Member
Delta colour compression and Bigger L2 cache so 32 ROPs is good is what I got from the DF article.

Quite light on new info or did I miss anything new ?

Not much of a deep dive when they don't even talk about what customization they did to the CPU or GPU.

yeah its all talk and fluff about how they did it, not technically WHAT they did.

Also MS dont like to call the Jaguar a Jaguar, its a fluffy cloud lol.
 
The lack of RPM is strange, but it does indicate that Scorpio is closer to Polaris than PS4 Pro (though neither is off-the-shelf).

The article also mentions that the "60 customizations" Microsoft did to the SOC are not all changes to how the silicon is architected. They're also counting customizations in the sense of choices about how many ROPs, how many shader cores, how much L2, etc. That is, not unique to Scorpio. EDIT: I'm sorry, this is a mistake. The larger choices like number of CUs, etc. are not included in that "60 customizations". These are smaller changes, some also cache size decisions, but some about feature inclusion.

And despite the CPU being called "Jaguar evolved" in a previous article, they mention only a single change besides the upclock: quicker speed for memory translation operations. (Or does that come naturally with the upclock and isn't a separate thing?)
 
I'd like to hear what the plan is to fit these games, with their 4k assets, on to discs.

We're already getting games that are taking up full discs, are we going to have to download 4k assets too?
 
Some other bits
"It's a completely unique design... you wouldn't be able to buy this anywhere else and really, we created this in conjunction with AMD and it is a nice unique part for Scorpio," says Nick Baker, Distinguished Engineer, Silicon.
Years before any silicon arrived back from chip manufacturing giant TSMC, the Xbox team began by carrying out a vast range of simulation and analysis. As Project Scorpio is effectively a mid-generation refresh - an extension of the existing console designed primarily for 4K screens - existing game code captured at a granular level via the PIX (Performance Investigator for Xbox) tool could be run on potential hardware designs, well before Microsoft went to AMD.
"This has two wonderful virtues from my perspective - as you know, the clock drives all the various different parts of the pipeline so it raises all boats," explains Andrew Goossen. "I don't get imbalances in my pipeline or introduce new bottlenecks or anything like that. The second one is that for the pixel pushing power we didn't need as much area, we didn't need as many CUs to hit that. It saves area - a pretty important consideration. We were 853MHz in Xbox One, we dialed it up to 1.172 GHz (1172MHz). That's a 37 per cent increase in clock, more than our CPU clock relatively. The next big one: we have 40 CUs. When you take 1172 multiplied by 40 multiplied by 64 for ops multiplied by 2 FLOPS per op, you get exactly 6.0TF."

Compared to Xbox One, the amount of shader engines doubles, which combined with the frequency boost sees triangle and vertex rate rise by 2.7x. The GPU's L2 cache gets a 4x increase, which Goossen says is there for targeting 4K performance.
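As a rough check, the arithmetic in the two passages above can be reproduced in a few lines. This is only a sketch: the function and variable names are made up for illustration, and it assumes triangle and vertex rate scale linearly with shader engine count and clock, which the article implies but does not spell out.

```python
# Back-of-the-envelope check of the GPU figures quoted above.
# All names here are illustrative, not from any Microsoft or AMD tool.

def peak_tflops(clock_mhz, compute_units, lanes_per_cu=64, flops_per_lane=2):
    """Peak FP32 throughput: clock x CUs x 64 ops x 2 FLOPS per op."""
    flops = clock_mhz * 1e6 * compute_units * lanes_per_cu * flops_per_lane
    return flops / 1e12

XBOX_ONE_CLOCK_MHZ = 853
SCORPIO_CLOCK_MHZ = 1172
SCORPIO_CUS = 40

scorpio_tflops = peak_tflops(SCORPIO_CLOCK_MHZ, SCORPIO_CUS)
clock_ratio = SCORPIO_CLOCK_MHZ / XBOX_ONE_CLOCK_MHZ

# Assumption: geometry rate scales with (shader engines x clock).
shader_engine_ratio = 2
geometry_ratio = shader_engine_ratio * clock_ratio

print(f"Peak FP32: {scorpio_tflops:.2f} TFLOPS")
print(f"Clock increase: {(clock_ratio - 1) * 100:.0f}%")
print(f"Triangle/vertex rate: {geometry_ratio:.1f}x")
```

Under those assumptions the numbers come out at roughly 6.0 TFLOPS, a 37 per cent clock increase, and a 2.7x geometry rate, which matches what Goossen and the article quote.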
Certainly in the PC space, GPU performance tends to scale with bandwidth - the more you increase compute power, the faster the memory you need to get best performance. It's an area where Sony were constrained in the PS4 Pro design. Having settled on 8GB of RAM, the only way to increase bandwidth and maintain compatibility was to swap in faster modules. There's a 2.3x increase in compute power, but only a 24 per cent increase in bandwidth. Microsoft ditched Xbox One's DDR3 and ESRAM combo, and moved to GDDR5.
Baker explains that with those variables in place, the decision in targeting the amount of memory Project Scorpio would address is essentially made for you, and in our E3 2016 Scorpio speculation, that same logic led us to conclude that the machine would indeed deliver 12GB of capacity. Four gigs is reserved for the system (an extra 1GB there is utilised for a full 4K dashboard), leaving 8GB for game developers - a substantial increase over the 5GB used on Xbox One, and indeed PlayStation 4 Pro.
"We can do full fidelity, incredibly high bit-rate GameDVR recording of your 4K60 experience," says Andrew Goossen. "You'll be able to play back locally at full fidelity and when you upload to YouTube you can automatically transcode - you can send up the raw thing as well, but typically we'll be doing a transcode to h.264 as part of that. We're also supporting HDR and SDR GameDVR so you'll be able to enjoy the full fidelity of the HDR experience, the challenge being - as you know - getting more platforms so you can actually view these.
"We wanted [native 1080p Xbox One games] to run at full native 4K with a rock-solid frame-rate with a whole bunch of performance left over to showcase and actually improve the visual experience in many other ways beyond render resolution," Andrew Goossen tells us. "And then our other goal was that we wanted to get 900p games up to full native 4K. That's a little bit harder. Some of 900p games - day one port - they should be running fine, solid at 2160p. For other games it's going to be more work than you'll traditionally do in terms of console optimisation but we wanted to get those 900p games at 2160p."
"For the very small handful of titles that run at 720p today, our expectation is that they can checkerboard up to native 4K if they want to do that. I also expect variations of titles that are perhaps running at 900p at 30fps on Xbox One today that they can leverage the 31 per cent boost to CPU clock along with a bunch of other optimisations in conjunction with our D3D12 offload to potentially offer 1080p60 rather than 900p30. It's totally up to developers."
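To put the resolution targets in the quotes above into numbers, here is a small sketch of the pixel-count jump from each Xbox One render resolution to native 4K. The frame sizes are an assumption (the usual 16:9 dimensions for each label); the article only gives the labels.

```python
# Pixel counts for the render resolutions mentioned above.
# Assumption: the usual 16:9 frame sizes for each label.
RESOLUTIONS = {
    "720p": (1280, 720),
    "900p": (1600, 900),
    "1080p": (1920, 1080),
    "2160p": (3840, 2160),
}

def pixels(label):
    """Total pixels per frame for a resolution label."""
    width, height = RESOLUTIONS[label]
    return width * height

def scale_to_4k(label):
    """How many times more pixels native 4K needs than `label`."""
    return pixels("2160p") / pixels(label)

for label in ("720p", "900p", "1080p"):
    print(f"{label} -> 2160p: {scale_to_4k(label):.2f}x the pixels")
```

With those frame sizes a 900p game has to push 5.76x the pixels to reach 2160p, against 4x for a 1080p game, which lines up with Goossen calling the 900p case "a little bit harder".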
 

LordOfChaos

Member
Huh, I had taken double packed FP16 for granted in this, so it doesn't actually have that.

Memory controller improvements were expected. Still unclear how much moving DX12 draw calls into dedicated hardware saves, DX12 calls I expect were only a small fraction of the CPU load already.


"Oles Shishkovstov: Let's put it that way - we have seen scenarios where a single CPU core was fully loaded just by issuing draw-calls on Xbox One (and that's surely on the 'mono' driver with several fast-path calls utilised). Then, the same scenario on PS4, it was actually difficult to find those draw-calls in the profile graphs, because they are using almost no time and are barely visible as a result. On PS4, most GPU commands are just a few DWORDs written into the command buffer, let's say just a few CPU clock cycles."

XBO part was probably on DX11, while the PS4s GCN was already taking just a few cycles and DWORDs per draw call.



PIX has always been an Xbox advantage (partly why the 360 was so much easier to tap into), that's a novel use of it to use it in a VM to do those silicon optimizations.
 

Hoo-doo

Banned
60fps console Dark souls 3 could be game changer, making Scorpio the console of choice for souls like.

At this point I feel it's too late to sway anyone with DS3, better to just focus on Dark Souls 4 and make that the definitive version.
And I think Souls aficionados will be hesitant to jump to Scorpio when there's the potential for a Bloodborne sequel.
 

Space_nut

Member
Loving the work they did. Knowing some Vega GPU features are in is great. Basically Polaris + some Vega enhancements.
 

Raide

Member
The emulation of hardware is fascinating. As was the various modifications for particular engines, not just a blanket thing.
 
^ yeah, that's pretty cool. Manually checking all of the games and optimizing for the major game engines.

Looking forward to the article tomorrow too. The drip feed continues... Though this seems like it'll be the end of the coverage they got from their visit?
 
Much of the initial word on it seems to have been marketing speak, a lot of which we would have expected, but the "60 customization" by looking at actual game engines being stuff like ROP counts is a bit of a stretch of the word "custom"
Check my edit: number of ROPs is not included, I was mistaken. Here's the direct quote from Microsoft (emphasis mine):

Nick Baker said:
The design from AMD lets us choose a lot of options in terms of number of SEs, CUs, the render back-end, the RBs that do the pixel blending, the cache sizes.... Aside from that there were also 60 or so specific targeted changes throughout the pipeline. Everything from various memory sizes, queue sizes, features....

But since the "customizations" include things like queue sizing, it does feel like a little bit of oversell.
 

Space_nut

Member
This is how I see the 2017 launch allowed for MS in the SOC:

6 TF vs. 4.2 TF
2.3 GHz vs. 2.1 GHz
326 GB/s vs. 218 GB/s
12 GB vs. 8+1 GB
HDMI 2.1 VRR vs. HDMI 2.0
And 60 customization in the hardware silicon to reduce bottlenecks that they profiled from actual games
And streamline their development toolkits
And lastly to get the cost down to their target point

Good call :)
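For a rough sense of how the compute and bandwidth lines in that list relate, here is a sketch that turns them into ratios. It reads the first value on each line as Scorpio and the second as PS4 Pro, which is an interpretation of the list rather than something it states, and the names are made up for illustration.

```python
# Headline figures taken from the list above.
# Assumption: first value on each line is Scorpio, second is PS4 Pro.
SPECS = {
    "Scorpio": {"tflops": 6.0, "bandwidth_gbs": 326.0},
    "PS4 Pro": {"tflops": 4.2, "bandwidth_gbs": 218.0},
}

def ratio(key):
    """Scorpio's figure divided by PS4 Pro's for the given spec."""
    return SPECS["Scorpio"][key] / SPECS["PS4 Pro"][key]

def gb_per_tflop(name):
    """Memory bandwidth available per TFLOP of peak FP32 compute."""
    spec = SPECS[name]
    return spec["bandwidth_gbs"] / spec["tflops"]

print(f"Compute ratio:   {ratio('tflops'):.2f}x")
print(f"Bandwidth ratio: {ratio('bandwidth_gbs'):.2f}x")
for name in SPECS:
    print(f"{name}: {gb_per_tflop(name):.1f} GB/s per TFLOP")
```

On these peak numbers the two machines end up with a similar amount of bandwidth per TFLOP, with Scorpio slightly ahead. Peak figures say nothing about real game performance, so treat it as a sanity check only.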
 

Theorry

Member
"We did over 30,000 emulator runs, which is a big contributor to Nick's total cycle count because we had to make sure that we were going to land with that 100 per cent compatibility [with Xbox One]."

Damn.
 

LordOfChaos

Member
Check my edit: number of ROPs is not included, I was mistaken. Here's the direct quote from Microsoft (emphasis mine):



But since the "customizations" include things like queue sizing, it does feel like a little bit of oversell.

Ah, yeah, I went to read that paragraph after you said it and didn't see them say that.
 

Space_nut

Member
Check my edit: number of ROPs is not included, I was mistaken. Here's the direct quote from Microsoft (emphasis mine):



But since the "customizations" include things like queue sizing, it does feel like a little bit of oversell.

I think the biggest thing is that MS had tested all the big games and seen all the bottlenecks devs were facing. And then they customized the gpu to handle them. Which is why the ForzaTech demo ran as good as the 1070 gpu. E3 will be good to see how games run on Scorpio compared to other gpus

"Those are kind of the big items but we also leveraged the fact that we understand the AMD architecture really, really well now and how well it does on our games," adds Goossen. "So we were able to go through and examine a lot of the internal queues and buffers and caches and FIFOs that make up this very deep pipeline that if you can find the right areas that are causing bottlenecks, for very small area we could increase those sizes and get effective wins. This was a very big focus of ours to go through and you basically really leverage that understanding of having those years of looking at performance on the Xbox One."

Increasing queue sizes and caches on the individual parts is a good achievement in its own right. They took every one of those things and manually increased and sped up the many individual parts on the GPU. Every part was improved and/or made larger.
 

blastprocessor

The Amiga Brotherhood
According to Goossen, some performance optimisations from the upcoming AMD Vega architecture factor into the Scorpio Engine's design, but other features that made it into PS4 Pro - for example, double-rate FP16 processing - do not.

Interesting.
 

Drain You

Member
I'd like to hear what the plan is to fit these games, with their 4k assets, on to discs.

We're already getting games that are taking up full discs, are we going to have to download 4k assets too?

I can't see it being that different. Just look at PC downloads for reference.
 

onQ123

Member
This is how I see the 2017 launch allowed for MS in the SOC:

6 TF vs. 4.2 TF
2.3 GHz vs. 2.1 GHz
326 GB/s vs. 218 GB/s
12 GB vs. 8+1 GB
HDMI 2.1 VRR vs. HDMI 2.0
And 60 customization in the hardware silicon to reduce bottlenecks that they profiled from actual games
And streamline their development toolkits
And lastly to get the cost down to their target point

Good call :)

Actually it's 6tf fp32 vs 4.2tf fp32/8.4tf fp16
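The FP16 figure follows from double-rate packing: two half-precision operations go through in the slot of one FP32 operation. A minimal sketch of that arithmetic, assuming that hardware without double-rate FP16 simply runs FP16 at the FP32 rate (the thread does not confirm this for Scorpio), with a made-up helper name:

```python
def peak_by_precision(fp32_tflops, double_rate_fp16):
    """Peak TFLOPS per precision.

    Assumption: without double-rate support, FP16 runs at the FP32 rate.
    """
    fp16 = fp32_tflops * 2 if double_rate_fp16 else fp32_tflops
    return {"fp32": fp32_tflops, "fp16": fp16}

ps4_pro = peak_by_precision(4.2, double_rate_fp16=True)
scorpio = peak_by_precision(6.0, double_rate_fp16=False)

for name, peak in (("PS4 Pro", ps4_pro), ("Scorpio", scorpio)):
    print(f"{name}: {peak['fp32']:.1f} TF fp32 / {peak['fp16']:.1f} TF fp16")
```

The 8.4 TF number only applies to work a developer can actually run at half precision, so it is an upper bound rather than a like-for-like comparison with the FP32 figures.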
 

jmga

Member
Actually it's 6tf fp32 vs 4.2tf fp32/8.4tf fp16

[Image: auunp3.jpg]
 

Space_nut

Member
But to what extent do those customisations elevate Scorpio beyond a PC equipped with a notional, baseline Radeon equivalent to Scorpio's GPU - no customisation but 'the same teraflops'. After the presentation, that's exactly what I asked Microsoft in the first of a couple of follow-up rounds of questions conducted over email.

"Our performance analysis and modelling was so core to the entire design process of optimisation and adjustments that I don't have a specific example to call out," says Andrew Goossen. "We put every change we considered through the model. But in terms of 'more from your teraflops', I will point out that Scorpio has significant performance benefits relative to PC:

"Microsoft has made continual improvements to the shader compiler. We see significant performance wins for Xbox game content relative to compiling the same shaders on PC. [Secondly], 'to the metal' API and shader extension support allows developer to optimise in ways that simply can't be done on PC cards. [Finally], PIX provides low level analysis and insight that, in conjunction with 'to the metal' support, allows developers to make the most of the console GPU. These technologies are all already mature and familiar to developers, so Scorpio games will benefit from the get go."

yes :)
 

dr_rus

Member
Lack of double speed FP16 is really surprising here considering that Neo got it last year. Maybe MS thought that it's not that helpful for what Scorpio is about.
 

geordiemp

Member
Since this is fluff pieces, this is all paid advertising, right?

Its a talkative article, the hard new facts are delta colour compression and L2 cache size bigger.

Actually those 2 are really strong, L2 will help the 32 ROPS and colour compression means Scorpio will be probably comfortable on bandwidth it seems.

Could you see anything else ?

The rest was to pad it out I guess. It's a lot of talk for those 2 titbits, and the focus was on repeating the specs we got last week. Maybe we will get 2 facts per week, I don't know, but is the title really in depth?

My take away - GPU nice and strong, RAM plenty, Bandwidth good for 4K, CPU still shit but lets call it summit else lol.

Good design for 4K, but will be CPU limited like the pro and we wont see much 30 to 60 FPS until next gen.
 