
Substance Engine benchmark implies PS4 CPU is faster than Xbox One's

Codeblew

Member
No offence to the chap, but I'd like to know what he's worked on, and is working on, before taking his word flat-out.

Since he's got an original username though, that speaks volumes.

If he tells us who he works for, then he can't go on to leak info now, can he? He has been verified by the mods.

LLVM with a Clang front end, iirc.


Nice! Clang really is a great compiler. I am glad it is getting even more support now.
 
There is another possible explanation for the performance difference. MS have mentioned many times that they run three(?) different OSes on the Xbox One at once under a virtualizer (i.e. likely some form of their Hyper-V hypervisor).

Well, if the test was run inside a VM container on top of Hyper-V, it will not get the same performance as when run natively. Hyper-V performance is anywhere between ~70% and ~95% of native, depending on the type of compute. We do not know how the PS4 OS runs code either, but it's possible that code on the PS4 does NOT run under a full virtualizer but instead under some sort of sandboxing, which would yield native hardware performance.

So even if it's running on a faster CPU, the benchmark could run slower in a VM. With my tinfoil hat on... if I realized that I was taking a performance hit because I'm forcing games to run in a virtualizer, I might up the CPU clock rate to ensure performance parity with my competitor that is not.
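One way to sanity-check overhead figures like those is to time the same compute-bound kernel natively and then inside the VM and compare. A minimal, generic sketch (this is not the Substance benchmark, just an FMA-heavy loop, so treat it as illustration only):

```cpp
// Time a compute-bound kernel; run the same binary natively and in a VM and compare.
// Illustrative only -- not the Substance benchmark.
#include <chrono>
#include <cstdio>

int main() {
    const long N = 200000000L;
    float a = 1.0001f, b = 0.9999f, acc = 0.0f;

    auto t0 = std::chrono::steady_clock::now();
    for (long i = 0; i < N; ++i)
        acc = acc * a + b;                  // multiply-accumulate, no memory traffic
    auto t1 = std::chrono::steady_clock::now();

    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    std::printf("kernel: %.1f ms (acc=%f)\n", ms, acc);  // printing acc keeps it from being optimized away
    return 0;
}
```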
Hey, now that's an explanation I can actually buy!!

If this is the case, it's certainly possible and even likely that code actually tuned for the VM would end up running faster on the Xbox One in the future.
Pretty sure Substance is "real" code…

Assuming they haven't already done so, how exactly would Allegorithmic "tune for the VM"? If, for example, the virtualization layer causes multiply-accumulates to run at 70% of native speed, what options do you have, apart from not doing them anymore? It seems that tuning would need to be done in Hyper-V itself, which has already been around for five years… =/

Did you mean faster than PS4, or just faster than it is currently?
 

stryke

Member
It is a weak laptop CPU, hence Sony loaded up on compute too.

Oh I know. Just laughing at the fact that not even Sony is beating around the bush with regard to the CPU. I suppose it is enough for a console, but still relatively weak overall.
 

Chumpion

Member
Interesting:
[image: benchmark slide comparing Jaguar to an "i7"]

Grow a brain people. The slide merely states a historical performance statistic. It could come from a devkit from way back.
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
Grow a brain people. The slide merely states a historical performance statistic. It could come from a devkit from way back.

I did not quote this piece for the stated clock speed, but for its comparison between Jaguar and an "i7" (10% performance difference at the same clock speed for this particular code), which I find interesting and more informative than what is usually presented on the matter.
 
I did not quote this piece for the stated clock speed, but for its comparison between Jaguar and an "i7" (10% performance difference at the same clock speed for this particular code), which I find interesting and more informative than what is usually presented on the matter.
To be fair, I had no idea that's what you were getting at, and made the same assumption as Chumpion.
 

Chumpion

Member
I did not quote this piece for the stated clock speed, but for its comparison between Jaguar and an "i7" (10% performance difference at the same clock speed for this particular code), which I find interesting and more informative than what is usually presented on the matter.

My bad brah, I wasn't replying to you. I just wanted to quote the image. Go with Christ brah!
 

mrklaw

MrArseFace
So this is the chatter I heard recently on Sony's 8GB bombshell.

In early 2012 SCE were told by developers that their 4GB target was too low and that they needed a minimum of 8GB for next gen development to be competitive over a 5-7 year period. In the middle of 2012 Sony approached Samsung over the viability of low power 4Gbit chips and were willing to give seed money for investment to that end. The rest as they say is history...

It just surprises me that Microsoft engineers didn't look at this path after revealing to developers that their system would have a pitifully small amount of bandwidth.

Especially as, if 4GB is too low, then surely 5GB isn't much better? (That's the free RAM for games on Xbox One.) Unless devs were talking total RAM, and when Sony were pitching 4GB they were worried they'd only have 2-3GB for games.
 

ThirdMartini

Neo Member
Hey, now that's an explanation I can actually buy!!


Pretty sure Substance is "real" code…

Assuming they haven't already done so, how exactly would Allegorithmic "tune for the VM"? If, for example, the virtualization layer causes multiply-accumulates to run at 70% of native speed, what options do you have, apart from not doing them anymore? It seems that tuning would need to be done in Hyper-V itself, which has already been around for five years… =/

Did you mean faster than PS4, or just faster than it is currently?

There are several things that cause performance issues in a VM. Certain memory accesses (page faults or TLB misses) are an obvious biggie. Anything that uses interrupts (such as CPU-assisted CRC, DMA, copies, etc.) or profile counters will cause an exit to the VMM, which can be damn expensive if we have a lot of those. Another issue specific to heavy compute is L1/L2 cache utilization: if the code consumes all of the L1 and/or L2 cache, a VMM exit will flush part of the caches. This again can cause a significant performance hit.

Changing your app to consume less cache could actually yield higher performance on a virtualized platform. (Not quite native hardware level... but damn close.)

The other thing we don't know (besides my speculation that it's running in a VM) is what the cost of the VMM is on a per-core basis on the Xbox One. Does the VMM wake up periodically? (In which case it will consume some percentage of the CPU.) If it is very heavy, say 15% CPU time, then that's something prime for MS to address in the future.

If we are in a VM, and the Xbox One CPU is at 1.75GHz vs 1.6GHz on the PS4, and the performance loss is caused by the VM and untuned code, then I can foresee retuned code and VMM improvements resulting in the same test running faster in the future.
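To make the "consume less cache" idea concrete: the usual trick is cache blocking, i.e. processing a large buffer in tiles small enough to stay resident in L1/L2, so the working set survives between passes and there is less to lose when a VMM exit evicts part of the cache. A generic sketch (not Allegorithmic's code):

```cpp
// Cache-blocking sketch (illustrative only, not Substance/Allegorithmic code).
// Two passes over a big buffer are done tile by tile, so the second pass hits
// data that is still warm in L1/L2 instead of streaming the whole buffer twice.
#include <algorithm>
#include <cstddef>
#include <vector>

void process_blocked(std::vector<float>& data) {
    const std::size_t kTile = 8 * 1024;    // 8K floats = 32 KB, roughly one L1 data cache
    for (std::size_t base = 0; base < data.size(); base += kTile) {
        const std::size_t end = std::min(base + kTile, data.size());
        for (std::size_t i = base; i < end; ++i)   // pass 1 over this tile
            data[i] = data[i] * 0.5f + 1.0f;
        for (std::size_t i = base; i < end; ++i)   // pass 2 reuses the still-warm tile
            data[i] = data[i] * data[i];
    }
}
```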
 
Unfortunately game devs seem to have a bad case of Not Invented Here, and tend not to trust the standard libs and advanced C++ features.

If you dig up the Sony LLVM presentation they say that their compiler defaults to no exceptions and no RTTI.

But I hope that'll start to change this generation.

I agree that it probably will change this generation.

If I were programming a multi-platform game, I would probably not be able to use C++11 features, because I would need to support older compilers for older hardware. Also, if I were currently writing a game, I would probably continue to optimize my code for PowerPC (often by using temp variables in order to avoid load-hit-store stalls). This will change as the Xbox 360 and PS3 die down, and games can eventually be developed using the more modern standards.
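For anyone wondering what "temp variables to avoid the load-hit-store" looks like: on the in-order PowerPC cores in the 360 and PS3, a load that closely follows a store to the same address stalls the pipeline, so you accumulate in locals (which stay in registers) and write back once. A rough, hypothetical sketch of the idiom:

```cpp
// Hypothetical sketch of the load-hit-store workaround described above.
struct Particle { float x, y, z; };

// Problematic on Xbox 360 / PS3 PPU: each iteration stores p.x and then loads it
// straight back on the next iteration, which the in-order core turns into a long stall.
void advance_slow(Particle& p, const float* vel, int steps, float dt) {
    for (int i = 0; i < steps; ++i) {
        p.x += vel[0] * dt;
        p.y += vel[1] * dt;
        p.z += vel[2] * dt;
    }
}

// Workaround: accumulate in temporaries that stay in registers, store once at the end.
void advance_fast(Particle& p, const float* vel, int steps, float dt) {
    float x = p.x, y = p.y, z = p.z;
    for (int i = 0; i < steps; ++i) {
        x += vel[0] * dt;
        y += vel[1] * dt;
        z += vel[2] * dt;
    }
    p.x = x; p.y = y; p.z = z;
}
```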

I have worked with libraries maintained by Google, and the libraries they use and maintain also do not use exceptions, because their code often cannot handle exceptions properly. Unfortunately, when a large legacy system that cannot handle exceptions properly already exists, it stifles the capabilities of the newer system.
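For reference, "no exceptions and no RTTI by default" lines up with Clang's -fno-exceptions and -fno-rtti flags, and with those on, failures come back as return values rather than being thrown. A small hypothetical example of that style (the function and names are made up):

```cpp
// Hypothetical example of the error-code style you end up with under
// -fno-exceptions / -fno-rtti: nothing throws, failures travel back as values.
#include <cstdio>

enum class LoadResult { Ok, FileMissing, Corrupt };

LoadResult load_texture(const char* path, unsigned char** out_pixels) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f)
        return LoadResult::FileMissing;     // instead of throwing an exception
    // ... decode the file into *out_pixels here, returning Corrupt on bad data ...
    std::fclose(f);
    return LoadResult::Ok;
}

void on_level_load() {
    unsigned char* pixels = nullptr;
    if (load_texture("rock.dds", &pixels) != LoadResult::Ok)
        std::printf("texture load failed, using a fallback\n");   // handled locally
}
```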
 

blu

Wants the largest console games publisher to avoid Nintendo's platforms.
Doesn't sound like the kind of software that would have to talk to APIs a lot, and I wonder what x86 compiler Sony could use that's that much faster than MS's. Does Intel still make their heavy duty optimizing compiler? I get the funny feeling the critical sections probably use some assembly on x86 anyway.
Sony's using clang/llvm. Intel's still making their compiler (and it's still among the top-notch ones, even though their standards compliance has been lagging), but that's got nothing to do with jaguar.
 
I wish Cerny would enlighten us about the CPU some more.

It's only a matter of time until a Dev lets slip with final CPU speeds, available cores & ram.

For what it's worth, TDP increases by over 60% running these cores @2Ghz compared to the stock 1.6Ghz, so I don't think there's a chance in hell Sony would risk it, given the thermals involved.

Much more likely it's just a case of the Xbox having higher overheads than the PS4.
 
It's only a matter of time until a Dev lets slip with final CPU speeds, available cores & ram.

For what it's worth, TDP increases by over 60% running these cores @2Ghz compared to the stock 1.6Ghz, so I don't think there's a chance in hell Sony would risk it, given the thermals involved.

Much more likely it's just a case of the Xbox having higher overheads than the PS4.
I thought 2 GHz was stock for the Jaguar. Does underclocking them really save that much power? That's pretty significant.
 

RoboPlato

I'd be in the dick
I wish Cerny would enlighten us about the CPU some more.

I just find it odd how they've been so open about everything else but have barely said anything about the CPU aside from "8 Jaguar cores". They've broken down in-depth stuff that's never outlined in official specs, but the CPU has barely been mentioned.
 

vpance

Member
It's only a matter of time until a Dev lets slip with final CPU speeds, available cores & ram.

For what it's worth, TDP increases by over 60% running these cores @2Ghz compared to the stock 1.6Ghz, so I don't think there's a chance in hell Sony would risk it, given the thermals involved.

Much more likely it's just a case of the Xbox having higher overheads than the PS4.

60% of what, though? Like 20W to 30W?
 

coldfoot

Banned
For what it's worth, TDP increases by over 60% running these cores @2Ghz compared to the stock 1.6Ghz, so I don't think there's a chance in hell Sony would risk it, given the thermals involved.
Incorrect. In that test, where they clock the Jaguar to 2GHz, they're also significantly overclocking the GPU, so the TDP increase is mostly due to that.
 

vpance

Member
Incorrect. In that test, where they clock the Jaguar to 2GHz, they're also significantly overclocking the GPU, so the TDP increase is mostly due to that.

I think they're clocked independently, no?

I remember thuway said not to rule out overclocks later on in the gen.
 

coldfoot

Banned
I think they're clocked independently, no?

I remember thuway said not to rule out overclocks later on in the gen.

Yes the chip has independent clocks, but the entire basis for the "60% more TDP" comes from here:

As you can see, the 2GHz chip also has a faster GPU core, and TDP doesn't really say much about actual power consumption (a bunch of A4 chips have a 15W TDP despite running at different clocks), so it's incorrect to assume that going from 1.6 to 2GHz on the CPU alone is going to result in a massive increase in heat and power consumption.
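Back-of-the-envelope support for that: dynamic power scales roughly with clock and with voltage squared, so a CPU-only bump from 1.6 to 2.0GHz at unchanged voltage is on the order of +25% CPU power, nowhere near +60% for the whole chip unless the GPU clock and/or the voltage go up as well. (Rough approximation only; the real capacitance and voltage figures aren't public.)

```latex
% Rough approximation; C (switched capacitance) and V are not public figures.
P_{\text{dyn}} \approx C\,V^{2}\,f
\qquad\Rightarrow\qquad
\frac{P_{2.0}}{P_{1.6}} \approx \frac{2.0\ \text{GHz}}{1.6\ \text{GHz}} = 1.25
\quad (\text{at constant } V)
```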

I remember thuway said not to rule out overclocks later on in the gen.
For that to happen, Sony would need to certify each chip at the higher clock but then ship them out clocked lower. They did this with the PSP, but there's no reason to do it with the PS4, as it's not a device that runs off batteries.
 
There are several things that cause performance issues in a VM. Certain memory accesses (page faults or TLB misses) are an obvious biggie. Anything that uses interrupts (such as CPU-assisted CRC, DMA, copies, etc.) or profile counters will cause an exit to the VMM, which can be damn expensive if we have a lot of those. Another issue specific to heavy compute is L1/L2 cache utilization: if the code consumes all of the L1 and/or L2 cache, a VMM exit will flush part of the caches. This again can cause a significant performance hit.

Changing your app to consume less cache could actually yield higher performance on a virtualized platform. (Not quite native hardware level... but damn close.)

The other thing we don't know (besides my speculation that it's running in a VM) is what the cost of the VMM is on a per-core basis on the Xbox One. Does the VMM wake up periodically? (In which case it will consume some percentage of the CPU.) If it is very heavy, say 15% CPU time, then that's something prime for MS to address in the future.

If we are in a VM, and the Xbox One CPU is at 1.75GHz vs 1.6GHz on the PS4, and the performance loss is caused by the VM and untuned code, then I can foresee retuned code and VMM improvements resulting in the same test running faster in the future.
Thanks for the response, and sorry for the late reply. I think I followed most of that. :D

I do have a question about the caches though. I was always under the impression those were handled directly by the hardware without dev intervention. How can a developer choose not to use them, or use less of them? That's why Sony added the Onion+ bus to PS4; to allow the procs to bypass the caches, right? Is this just an option on virtualized machines or something? How can the performance of uncached, virtualized code be "damn near" the performance of cached, native code? =/

In any case, wouldn't Allegorithmic, MS, and friends have already plucked the low-hanging fruit from this particular tree? It seems far from guaranteed we'd see significant performance gains in the future, particularly from the VM itself. </noob>
 

satam55

Banned
This presentation explicitly says 1.6GHz, and it's dated November 19:
http://www.slideshare.net/DevCentralAMD/mm-4085-laurentbetbeder

Does anyone have the PDF? I want to get a closer look at the processor on page 3. I wonder if that's the PS4 design or just something they used for the presentation?


It looks a lot like this one:

[image]



Edit: never mind

[image: dataplane processor diagram]

I'm guessing that it's a DPU

What's a DPU?

[images: slide 3 from the presentation, a dataplane processor diagram, and a DPU diagram]

Those slides are very interesting and kind of confirm that the secondary ARM processor and the DPU we see mentioned are basically the same thing. The documentation in those slides and from Tensilica offers quite a lot of nice tidbits.

Reading those slides made me think about sound processing on the GPU and the whys and why-nots it presents... I seem to recall people talking about AMD's audio technology being built into the PS4's GPU, yet this SCEA paper does not seem to make any mention of that when discussing the pros and cons of running game audio code on the CPU vs the GPU vs the ACP/secondary processor... at an AMD conference, no less.

Curious, don't you think?

Did anyone else notice from those slides & info that the PS4 has a dedicated block for voice/speech recognition, echo-cancellation, & other enhancements like SHAPE on XBox1 does? It seems like ever since it was announced that the PS4's audio chip is based on AMD's Trueaudio DSP, all those advantages that SHAPE was going to give the Xbox1 over the PS4, disappeared.
 
Wow, go on vacation, miss an interesting thread.

So is there a general consensus on whether this indicates a faster PS4 CPU clock, fewer cores reserved on PS4, or some weird OS issue on XB1?
 

pswii60

Member
Did anyone else notice from those slides & info that the PS4 has a dedicated block for voice/speech recognition, echo-cancellation, & other enhancements like SHAPE on XBox1 does? It seems like ever since it was announced that the PS4's audio chip is based on AMD's Trueaudio DSP, all those advantages that SHAPE was going to give the Xbox1 over the PS4, disappeared.
I thought TrueAudio was an SDK and software solution on PS4, whereas SHAPE is a separate dedicated hardware solution?
 

satam55

Banned
Would this be good for something like remote-play.... or dare I say it, Forteleza?

Sony could be caught off guard by going cheap with their wireless. I mean it's not like they are going to fracture the userbase by upgrading to compete... should have had the fancy-shmancy wifi's in the first place.

I mean, surely Sony sees Forteleza coming, don't they? I wonder what their response will be.

The way Sony is pushing Remote Play, second screen, & Gaikai with the PS4, I think we'll likely see the wireless chip upgraded when Sony releases a new PS4 model/hardware revision.

Also, it looks like Forteleza is meant to deliver an AR experience. The PS4 Eye camera already delivers the AR experience for the PS4.
 
Wow, go on vacation, miss an interesting thread.

So is there a general consensus on whether this indicates a faster PS4 CPU clock, fewer cores reserved on PS4, or some weird OS issue on XB1?
No consensus. The most plausible theory I've heard is the virtualization on XBone is reducing per-core CPU performance, to significantly below PS4 levels in the case of this particular software at least, and Matt confirmed you can "get more" out of the PS4's CPU.

Anyone know if the VM would have similar effect on GPU performance?
 
No consensus. The most plausible theory I've heard is the virtualization on XBone is reducing per-core CPU performance

Makes sense... install a virtual machine on a PC, install the exact same OS within it, run some benchmarks, and you'll see a 5-10% drop in the results due to the overhead of running an OS within a virtual machine environment... just like the Xbox One is doing.


Lol... just about every decision MS have taken to turn this turd of a box into a media centre has handicapped its gaming performance.
 

imt558

Banned
In the original Trueaudio thread.

So there's a separate Trueaudio chip in PS4, or a separate core on one of the dies for it? It doesn't utilise any of the existing CPU cores or GPU CUs?

There is a dedicated audio chip in the PS4 (just like Cerny said before). Info about the TrueAudio DSP came out later.

Did anyone else notice from those slides & info that the PS4 has a dedicated block for voice/speech recognition, echo-cancellation, & other enhancements like SHAPE on XBox1 does? It seems like ever since it was announced that the PS4's audio chip is based on AMD's Trueaudio DSP, all those advantages that SHAPE was going to give the Xbox1 over the PS4, disappeared.

Well, nice to hear. Didn't notice that.
 

Panajev2001a

GAF's Pleasant Genius
Did anyone else notice from those slides & info that the PS4 has a dedicated block for voice/speech recognition, echo-cancellation, & other enhancements like SHAPE on XBox1 does? It seems like ever since it was announced that the PS4's audio chip is based on AMD's Trueaudio DSP, all those advantages that SHAPE was going to give the Xbox1 over the PS4, disappeared.

Re-reading it, yes... TrueAudio technology integrated into the ACP and not the GPU directly. Moot point... same SoC, same access to fast RAM. I guess I read too much into that presentation not using the TrueAudio name anywhere.
 

DonMigs85

Member
I think they're clocked independently, no?

I remember thuway said not to rule out overclocks later on in the gen.

Hmm, REAL overclocking or just unlocking the full potential of the hardware, like when Sony officially allowed PSP software to use the full 333Mhz CPU/166Mhz GPU of the PSP?
 

vpance

Member
Hmm, REAL overclocking or just unlocking the full potential of the hardware, like when Sony officially allowed PSP software to use the full 333Mhz CPU/166Mhz GPU of the PSP?

Unlocking the full potential. As coldfoot said, the chips must be tested to hit those speeds before assembly; otherwise you'd get some consoles bricked by a firmware update that upclocks them. And the reason they shipped with a lower clock for now might be lifespan or fan noise concerns.
 

coldfoot

Banned

That quote refers DIRECTLY to the chart I quoted from that same page. Anandtech DID NOT OVERCLOCK AMD Jaguars to come to that conclusion, they're just reading the chart and ignoring the GPU upclock. Here is what you said:
The 60% TDP increase comes from an Anandtech article where they overclocked a Jaguar core from the stock 1.6Ghz to 2Ghz.
Which is 100% false. At the time of that article, they did not even have a Jaguar to test. The actual Anandtech Jaguar test came months later, and no, they did not try overclocking.

Here's the chart once again:
 
Not strictly on topic, this doesn't have any info on CPU speed, but this presentation popped up recently:

http://research.scee.net/files/presentations/gceurope2013/ParisGC2013Final.pdf

Covers things like the GPU modifications and graphics libraries, the approach towards compute etc.

Interesting, thanks for the link.

Nice to see that Sony understood that particle effects and the like are usually restricted by memory bandwidth and insufficient fill rate, so they went with the highest memory bandwidth that suited their needs and then made sure the fill rate exceeded the memory limits, to get the most out of that extra-fast memory.

Slightly OT, but it's nice to read :)
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.

What 14+4 myth?

One of the VGLeaks articles mentioned that the PS4's GPU is "balanced at 14+4", which many took as evidence for only 14 CUs being available for graphics, the rest being dedicated to compute.
 
