
PlayStation 4 hUMA implementation and memory enhancements details - Vgleaks

USC-fan

Banned
FLOP difference:

PS2 vs Xbox
15.4 GFLOPS

Xbox vs GC
11.1 GFLOPS

XOne vs PS4
530 GFLOPS

I can't see how such a big gap couldn't make a difference.
I like this better: the difference in GPU FLOPS only, before the Xbone overclock.

ps3 (166 GFLOPS*) + wiiu (176 GFLOPS) + x360 (216 GFLOPS) + xbone (1,340 GFLOPS) adds up to roughly what the ps4 (1.86 TFLOPS) delivers on its own.

*ps3 figure is estimated.
 

fritolay

Member
Yes, as I said, a highly simplistic comparison :p

RAM advantage aside, we haven't even touched the actual specifications of the GPU, never mind the 'teraflops'. The PS4 GPU has an advantage in pretty much every way possible.

Xbox One:
1.31 TFLOPS
40.9 GTex/s - Texture Fill Rate
13.6 GPix/s - Pixel Fill Rate
68 GB/s DDR3
109 GB/s eSRAM
16 ROPs
12 CUs (768 ALUs) - Compute Units

PS4:
1.84 TFLOPS (+40%)
57.6 GTex/s (+40%) - Texture Fill Rate
25.6 GPix/s (+88%) - Pixel Fill Rate
176 GB/s GDDR5
32 ROPs
18 CUs (1152 ALUs) - Compute Units
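
(For context: these peak figures follow from ALU count × 2 floating-point ops per clock (one fused multiply-add) × clock speed, so 1152 × 2 × 800 MHz ≈ 1.84 TFLOPS for the PS4, and 768 × 2 × 853 MHz ≈ 1.31 TFLOPS for the Xbox One.)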

Can someone post the same specs for the PS3 and XBOX 360?
 

IN&OUT

Banned
Well, the problem is that PC ports usually use the extra oomph in the form of brute force. I'll try to explain (though last time I did, Durante yelled at me... is it obvious that I'm afraid of Durante?)


When designing a game for a PC you aren't building to a specific system. You're building to an insanely wide array of systems. If you're building an Xbox 360 game you know exactly what to expect - each DVD drive is the same speed, each hard drive is the same speed, each GPU and CPU is the same, each has the same amount of RAM, etc. You get it. On the PC, I personally have a fairly modest 3.6GHz i7, a 650 Ti, and 8GB of RAM. Super PC gamer X has a better processor, a $1000 GPU, 16GB of RAM, etc. And then casual PC gamer Y has a laptop with a 2.2GHz i5 (a year-old mobile processor) and 4GB of RAM. A developer wants the game to run on all of these systems.

The issue is that since a PC developer needs to worry about Gamer Y's laptop, Gamer X's supercomputer never gets programmed for specifically. Things are generally just scaled up. If you have the extra processor speed/better RAM/faster hard drive you'll load much faster; if you have the better GPU you can turn on more effects. The games obviously look much, much better on Gamer X's supercomputer... but they don't look nearly as good as they would if the developer said "fuck gamer Y, fuck mortimer, I'm making a game specifically for Gamer X's system" and worked to the strengths of the system as a whole instead of relying on the brute force of it just being faster.


That's why the consoles, modest in terms of PC specs, will make great-looking games. The developers will learn these systems and work specifically on what they do well. Meanwhile there may be amazing features in my 650 Ti that never, ever get used, because there just isn't a reason for a PC developer to hone in on that one card.


It may sound like I'm shitting on PC gaming, but I'm not. PCs will always produce the best-looking games because of their ability to scale upwards. Even though your graphics card won't have every trick inside it exploited like a console's will, it will still produce incredible graphics while it's relevant. By the time the PS4 launches my current PC will be two years old, and it will run multiplats like BF4 better than the PS4 will.


This is why Crysis on the PC was such an amazing-looking game, though. It actually targeted high-end systems at the time. For years afterwards, when you got a PC the question was "ok, but how well does it run Crysis?" They programmed to the strengths of high-end PCs of the day, and it stayed in the category of amazing-looking game for years because of it.


But back to your question... this isn't anything you can brute force. This doesn't make the CPU or GPU more powerful. It will take time and expertise to exploit it. Most people, including Sony PR/Cerny, think it will take a couple of years at least. But once these tools get worked into the SDK I think you will see them used to some extent in most games. We are years away from that, though.

This post makes too much sense. You simplified the PC-to-console differences while touching on the pros and cons of each.

Thank you very much.
 
No. hUMA is a tech that AMD and others will be pushing over the coming years. It's believed that the Xbox One has something similar.


It's good news though. The less time the system needs to spend swapping memory, the more time it can spend on other things. It's going to help games become more graphically rich and more complex as developers become more and more efficient with the system. This happens with all hardware, but hUMA should multiply that effect. If this is as powerful and simple as it sounds... the difference between launch games and end of gen games will be the largest in console history. We'll look at Killzone and Forza and laugh that we thought they looked good.

 
If this is as powerful and simple as it sounds... the difference between launch games and end of gen games will be the largest in console history. We'll look at Killzone and Forza and laugh that we thought they looked good.
And that's why this generation will last 7 years or more. Good for Sony though, looks like they finally got around to building a system that makes a lick of sense and is actually efficient. Probably due to Ken Kutaragi not being around to muck up the design with useless technology.
 
I like this better: the difference in GPU FLOPS only, before the Xbone overclock.

ps3 (166 GFLOPS*) + wiiu (176 GFLOPS) + x360 (216 GFLOPS) + xbone (1,340 GFLOPS) adds up to roughly what the ps4 (1.86 TFLOPS) delivers on its own.

*ps3 figure is estimated.

Holy Christ I had no idea the U was that much weaker than the 360.
 

Piggus

Member
Xbox fanboys are pretty much in denial that the PS4 is more powerful than the Xbox One. I'm sure they were bragging about the original Xbox being more powerful than the PS2, and about how the Xbox 360 received a ton of better multiplats than the PS3 because of the PS3's Cell architecture.

My, my, how the tables have turned.

They'll get a dose of reality soon enough. If they're buying a Bone for the games, fair enough. It has a nice lineup. I feel kinda bad for the people who are convinced Microsoft would never let Sony have a power advantage and think it's somehow more powerful. Like that dumbass who wrote up an entire article about the Xbone APU having a hidden second GPU or some shit based on the die size. It's hilarious.
 

Jachaos

Member
Holy Christ I had no idea the U was that much weaker than the 360.

It's not... the Wii U is an unknown at this point, or at least it was last I read, and FLOPS aren't everything. The Wii U is complicated to develop for, which was a stupid choice because it makes getting third-party support even harder for Nintendo. But a fully optimized game for Wii U will look much nicer than a fully optimized game for 360. Hell, you only have to look at Most Wanted - much improved lighting, PC-version textures, and the system outputting to the GamePad simultaneously - to know that it's more powerful than the 360. Look at Mario Kart 8 and look at kart racers on current-gen. It's clearly not going to compete on pure visual performance against consoles coming in only a few months, but it's not weaker than a 360.
 

USC-fan

Banned
Holy Christ I had no idea the U was that much weaker than the 360.

It's really not. GFLOPS are a bad measurement across GPU generations. The reason they're good for comparing the Xbone and PS4 is that both are based on the same design.

The Wii U GPU is stronger than the 360's. It's built on a much better/newer GPU design.

It's not... the Wii U is an unknown at this point, or at least it was last I read, and FLOPS aren't everything. The Wii U is complicated to develop for, which was a stupid choice because it makes getting third-party support even harder for Nintendo. But a fully optimized game for Wii U will look much nicer than a fully optimized game for 360. Hell, you only have to look at Most Wanted - much improved lighting, PC-version textures, and the system outputting to the GamePad simultaneously - to know that it's more powerful than the 360. Look at Mario Kart 8 and look at kart racers on current-gen. It's clearly not going to compete on pure visual performance against consoles coming in only a few months, but it's not weaker than a 360.
Most of the improvement comes from having more eDRAM and more RAM.
 

Jachaos

Member
Most of the improvement comes from having more eDRAM and more RAM.

Still, throwing GFLOPS around like that is not useful for comparing totally different modern architectures. Also, I can't believe the 176 GFLOPS number is in any way official. Last I read it was around 500; what happened to that? Where is the 176 GFLOPS number from?
 

FeiRR

Banned
The GPU was tweaked quite a bit, so it wasn't all that much weaker (but yeah, the 360's is better, just with fewer tweaks). The problem was that coding for the PS3 was all custom stuff that was a waste of time. This is why Gaben bitched about it back in the day... you had to learn specifically how to make games take advantage of the PS3 hardware, and none of that work translated over to the 360 or PC. The Cell was a brilliant chip that was a complete waste of time to learn... unless you were first party.
I read somewhere, I think it was one of Cerny's interviews, that the knowledge the first-party programmers got isn't going to waste. Cell's SPUs are about parallelism and when I read this hUMA document (although my IT knowledge is somewhat limited), I think I understand what he meant. On one hand, PS4's CPU has 6 cores available to the programmer (just like 6 SPUs in the Cell). On the other, all that parallelism of hUMA is just the exact concept all those first-party devs have been trying to master for the last 8 years. We'll probably see the results of that approach sooner than we think. It sounds like The Order is already using some of those features for material deformation. Physical expansion of engines is what I'm waiting to see develop rapidly in the coming years.
 

USC-fan

Banned
Still, throwing GFLOPS around like that is not useful for comparing totally different modern architectures. Also, I can't believe the 176 GFLOPS number is in any way official. Last I read it was around 500; what happened to that? Where is the 176 GFLOPS number from?

It comes from the Wii U only having 160 ALUs. The 500 GFLOPS number was just crazy fans. There's a lot more information in the Wii U GPU thread.
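
(For reference: 160 ALUs × 2 ops per clock × the commonly reported ~550 MHz clock ≈ 176 GFLOPS, which matches the figure being discussed.)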
 

antic604

Banned
I read somewhere, I think it was one of Cerny's interviews, that the knowledge the first-party programmers got isn't going to waste. Cell's SPUs are about parallelism and when I read this hUMA document (although my IT knowledge is somewhat limited), I think I understand what he meant. On one hand, PS4's CPU has 6 cores available to the programmer (just like 6 SPUs in the Cell). On the other, all that parallelism of hUMA is just the exact concept all those first-party devs have been trying to master for the last 8 years. We'll probably see the results of that approach sooner than we think. It sounds like The Order is already using some of those features for material deformation. Physical expansion of engines is what I'm waiting to see develop rapidly in the coming years.

Well, the steep learning curve of Cell was actually already beneficial this gen - just look at DICE's BF3 presentations or interviews with Metro or NFS:MW devs. They say that because they had to make the PS3 version perform well, they had to 'learn' how to jobify their workloads to be able to use the SPUs. It turns out the experience gained that way was beneficial for the remaining platforms too, resulting in more optimal utilization of available resources, especially on PCs, where they had previously been simply brute-forcing things.

My opinion is that Kutaragi was actually right, but his vision was a bit ahead of its time, resulting in lesser ports for PS3 in the early years and a lot of bitching from devs who had to shift their paradigm.
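
To make the "jobify" idea concrete, here's a minimal C++ sketch of the pattern (hypothetical names, not any real engine's or SDK's API): instead of one big serial update loop, each independent chunk of frame work becomes a small job that a pool of worker threads drains.

#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// A minimal thread-safe job queue; real engines use fancier
// lock-free structures, but the idea is the same.
class JobQueue {
public:
    void push(std::function<void()> job) {
        std::lock_guard<std::mutex> lock(mutex_);
        jobs_.push(std::move(job));
    }
    bool pop(std::function<void()>& job) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (jobs_.empty()) return false;
        job = std::move(jobs_.front());
        jobs_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::queue<std::function<void()>> jobs_;
};

int main() {
    JobQueue queue;
    // Each independent slice of frame work (skinning, particles,
    // culling...) becomes one small job.
    for (int i = 0; i < 64; ++i)
        queue.push([i] { /* process batch i */ });

    // One worker per available core (six on PS4, per the post above).
    std::vector<std::thread> workers;
    for (int w = 0; w < 6; ++w)
        workers.emplace_back([&queue] {
            std::function<void()> job;
            while (queue.pop(job)) job();
        });
    for (auto& t : workers) t.join();
}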
 
The PS4 GPU being heavily modified for compute functions is what will ensure there are stacks of untapped future potential in the system for years to come.

While the implementation of hUMA in the design ensures developers can get close to the optimum from the base system from the get-go.

Right?
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
So what does that mean? Well, if our jetski on a non-hUMA product takes 20% of the available resources to render the water and the real-time waves created by the jetskis, it may now (and I'm pulling a number out of my ass here, but the % isn't the point) take 10%. So that's 10% "extra" that they have. Once you add everything... the AI, the lighting, all of the animations, the graphical effects, particles, etc... you have more overhead to add more, because of the cycles you saved on the water rendering using hUMA.

You are not saving cycles, you are saving memory bandwidth.

In an architecture with a unified, shared memory space and cache-coherent memory access, many processors (in this case CPU/GPU and probably other special-purpose units) can access the same memory location concurrently. In other systems like the 360, the memory space was divided into isolated memory areas, where one was exclusive to the CPU and the other exclusive to the GPU. If you wanted to share data between the two, the data had to be copied, thus costing bandwidth in the process. (It also introduces latency that might stall individual tasks, but this is hidden by task scheduling and context switching.) In hUMA-like architectures, there is no need for that.

To introduce the worst analogy of the day, UMA is like Berlin separated by the wall with the CPU being in the west and the GPU in the east, and every interaction has to pass the border control. With hUMA, the wall is torn down and everybody can roam the city freely.
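
To put the analogy in code, here's a tiny C++ sketch (plain heap structs standing in for the CPU and GPU memory pools; no real console API is shown) of what the wall costs: in the split case every frame pays bandwidth to copy the data across, while in the shared case both processors work on the same allocation and the copy simply disappears.

#include <cstring>

struct Particles { float x[1024], y[1024]; };

// Split memory (360-style): the GPU can't see the CPU's copy, so the
// updated data is duplicated across the "wall" every frame, costing
// bandwidth proportional to its size.
void frame_split(const Particles& cpuSide, Particles& gpuSide) {
    // ... CPU simulates into cpuSide ...
    std::memcpy(&gpuSide, &cpuSide, sizeof(Particles)); // the border control
    // ... GPU renders from gpuSide ...
}

// hUMA-style: one allocation visible to both processors, with the
// caches kept coherent by hardware. No copy, no duplicated bandwidth.
void frame_shared(Particles& shared) {
    // ... CPU simulates into `shared`; GPU renders from the same address ...
    (void)shared;
}

int main() {
    Particles cpu{}, gpu{};
    frame_split(cpu, gpu);   // pays sizeof(Particles) in bandwidth per frame

    Particles unified{};
    frame_shared(unified);   // pays nothing for sharing
}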
 

KidBeta

Junior Member
Looking at the games shown, the XB1 seems more powerful than the PS4 to me.

If you want to compare the performance of the consoles, you really need to do it with multiplatform games. Wait for them to come out and have a look; comparing exclusives muddies the water a lot.
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
Remember when we had 8, 16, 32, 64, 128 bits? Yeah, I miss that too.

The funny thing is that the N64 almost never used its full 64-bit width. I have the N64 SDK here, and even the documentation explicitly discourages it for performance reasons.
 
You are not saving cycles, you are saving memory bandwidth.

In an architecture with a unified, shared memory space and cache-coherent memory access, many processors (in this case CPU/GPU and probably other special-purpose units) can access the same memory location concurrently. In other systems like the 360, the memory space was divided into isolated memory areas, where one was exclusive to the CPU and the other exclusive to the GPU. If you wanted to share data between the two, the data had to be copied, thus costing bandwidth in the process. (It also introduces latency that might stall individual tasks, but this is hidden by task scheduling and context switching.) In hUMA-like architectures, there is no need for that.

To introduce the worst analogy of the day, UMA is like Berlin separated by the wall with the CPU being in the west and the GPU in the east, and every interaction has to pass the border control. With hUMA, the wall is torn down and everybody can roam the city freely.

You save cycles as well, since you won't have to flush the entire cache, so you avoid lots and lots of expensive cache misses.
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
You save cycles as well, since you won't have to flush the entire cache, so you avoid lots and lots of expensive cache misses.

Not in general. That's why I mentioned task scheduling and context switching. It's the same with texture fetching. It too stalls the current task, but context switching hides that.

(You save some cycles because you don't need the management overhead for explicit CPU/GPU interaction anymore, but that should be negligible compared to the processing time spent on the actual program.)
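
A toy model of that latency hiding, sketched in C++ (purely illustrative; a real GPU does this in hardware across thousands of threads): whenever the running task issues a fetch and stalls, the scheduler just resumes some other ready task, so the execution unit stays busy even though each individual task spends cycles waiting.

#include <cstdio>
#include <vector>

struct Task {
    int workLeft = 4;   // compute steps still to run
    int stallLeft = 0;  // cycles left waiting on a memory fetch
};

int main() {
    std::vector<Task> tasks(4);
    int cycle = 0, busyCycles = 0;
    bool anyLeft = true;
    while (anyLeft) {
        anyLeft = false;
        // All in-flight fetches progress every cycle.
        for (auto& t : tasks)
            if (t.stallLeft > 0) { --t.stallLeft; anyLeft = true; }
        // The execution unit runs one ready task per cycle: this is
        // the context switch that hides the other tasks' stalls.
        for (auto& t : tasks) {
            if (t.stallLeft == 0 && t.workLeft > 0) {
                --t.workLeft;                        // one compute step
                if (t.workLeft > 0) t.stallLeft = 2; // next step waits on a fetch
                ++busyCycles;
                anyLeft = true;
                break;
            }
        }
        ++cycle;
    }
    // With enough tasks to switch between, busyCycles stays close to cycle.
    std::printf("utilization: %d of %d cycles busy\n", busyCycles, cycle);
}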
 

TheKayle

Banned
If you want to compare the performance of the consoles, you really need to do it with multiplatform games. Wait for them to come out and have a look; comparing exclusives muddies the water a lot.

Clearly, yes :)

In fact I'm still waiting to see a noticeable difference.

But as always, I'll buy both consoles.
 

KidBeta

Junior Member
Not in general. That's why I mentioned task scheduling and context switching. It's the same with texture fetching. It too stalls the current task, but context switching hides that.

(You save some cycles because you don't need the management overhead for explicit CPU/GPU interaction anymore, but that should be negligible compared to the processing time spent on the actual program.)

You save both, because if you do it the old way and invalidate the entire cache, then you're going to be waiting for data to come back in from memory before you can do anything.


I know, Knack is next gen, just like 30fps Driveclub and 720p BF4.

What about being honest with yourself?


Where are the XBONE versions of these games? Or are you trying to make a false comparison?
 

ElTorro

I wanted to dominate the living room. Then I took an ESRAM in the knee.
You save both, because if you do it the old way and invalidate the entire cache, then you're going to be waiting for data to come back in from memory before you can do anything.

Not really, because in the case of such stalls the GPU just switches execution contexts and does something else. It's the same with texture fetch operations: those too stall the current task, but the GPU simply switches context. The individual task is stalled, but the overall resource saturation can be maintained. (As I said, some cycles for management logic are always lost, but this should be comparatively negligible.)

(Diagram: GPU stalls hidden by context switching, from http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf)
 

TheKayle

Banned
You save both, because if you do it the old way and invalidate the entire cache, then you're going to be waiting for data to come back in from memory before you can do anything.

Where are the XBONE versions of these games? Or are you trying to make a false comparison?

KidBeta, you are right about "waiting for multiplats". I respect your point of view... I really do.

But clearly people here are blinded by fanboyism, seeing Knack and a sub-HD, mid-settings version of BF4 as next-gen games.

I know that Forza is 1080p@60fps, and honestly it's more "beautiful" than Driveclub... but I know the comparison is bad, because that's Turn 10, and Forza is a bigger franchise...

But on the other hand, reading in 800 thousand threads that this is a console that can eat the Wii U's + X360's + XB1's flops all together (someone actually made that really stupid statement), it would be really sad not to see something that stands out over the rivals... (and the PS4 doesn't have anything that stands out). That's my point.
 

KidBeta

Junior Member
KidBeta, you are right about "waiting for multiplats". I respect your point of view... I really do.

But clearly people here are blinded by fanboyism, seeing Knack and a sub-HD, mid-settings version of BF4 as next-gen games.

I know that Forza is 1080p@60fps, and honestly it's more "beautiful" than Driveclub... but I know the comparison is bad, because that's Turn 10, and Forza is a bigger franchise...

But on the other hand, reading in 800 thousand threads that this is a console that can eat the Wii U's + X360's + XB1's flops all together (someone actually made that really stupid statement), it would be really sad not to see something that stands out over the rivals... (and the PS4 doesn't have anything that stands out). That's my point.

Beauty != Power

A game can have beauty and not require much power, or it can require lots. You are taking something subjective (how good you believe a game looks) and trying to extrapolate from it, saying you believe there is no difference between the two because of it.

This is like me saying I like how a Volkswagen Beetle looks, therefore it's as fast as a Ferrari.

And before you start comparing Forza to Driveclub, keep in mind that Forza has a lot more static effects (level shadowing?, lighting, time of day, etc.) that Driveclub computes in real time.


What does "bits" mean in the context of adding them to something?

In the context of memory addressing/caching, adding bits to something refers to adding new bits to the hardware that represent specific things (such as whether or not to cache data from a specific address).
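
A hypothetical C++ sketch of what that looks like (illustrative only, not any real console's layout): a descriptor for a memory range dedicates individual bits to properties such as "cacheable" or "GPU-coherent", and the hardware checks those bits on every access to decide how to treat the data.

#include <cstdint>

// Hypothetical descriptor flags: each bit tells the hardware how to
// treat accesses to the memory range this descriptor covers.
enum PageFlags : uint32_t {
    PAGE_PRESENT      = 1u << 0,  // the mapping is valid
    PAGE_WRITABLE     = 1u << 1,  // stores are allowed
    PAGE_CACHEABLE    = 1u << 2,  // data may live in the CPU caches
    PAGE_GPU_COHERENT = 1u << 3,  // GPU accesses snoop the CPU caches
};

inline bool should_cache(uint32_t flags) {
    return (flags & PAGE_PRESENT) && (flags & PAGE_CACHEABLE);
}

int main() {
    // A shared, uncached range: present, writable, coherent with the GPU.
    uint32_t page = PAGE_PRESENT | PAGE_WRITABLE | PAGE_GPU_COHERENT;
    return should_cache(page) ? 1 : 0;  // returns 0: bypass the CPU cache
}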
 

JawzPause

Member
I've been reading a lot of this stuff, and I am far from an expert, but I will give this a shot. This is my understanding, corrections are of course welcome.

So these two systems are very very similar. More comparable than any other two consoles in the same gen in history, I'd wager.

The Sony setup is very straightforward. It has high-bandwidth access to a bunch of fast RAM. Most of the philosophy behind the PS4 is their own 180 from the PS3. Nothing is particularly exotic or fancy. It's a workhorse and well provisioned. It sort of falls within the general philosophy of hUMA, which is to eliminate CPU/GPU bottlenecks. PS4 games will get up to speed quickly. New tricks will be discovered as time goes on because of the new communication abilities between CPU and GPU.

The Xbone approach more closely resembles something like what Sony might have done before. In order to use the much cheaper RAM they opted for, they are using some hUMA-type techniques, notably a tiny chunk of fast eSRAM. This helps leverage the slower RAM into much higher performance. But it's not as simple; it needs to be accounted for, and that will have an impact.

Now, much like with Cell, hard software problems do get solved eventually and performance increases, so we can of course expect Xbone performance to climb over time too. But it's not quite as straightforward as the PS4 solution, and to my untrained eye it seems you can see a bit of the compromise that MS has opted to make in the service of longevity/price/component suppliers/whatever.

But I still think the takeaway is that these boxes are more similar than not, and while the PS4 does indeed hold an edge, it remains to be seen how that actually plays out in the wider market. All we know for sure is that Sony's first party will probably blow the doors off like they usually do, but who knows beyond that.
Nice post, thanks for summarising.
 
Yep. The PS3 was more powerful but much harder to program for than the 360. The PS4 is more powerful and easier to program for than the Xbox One. That's a pretty nice combo for Sony this time.

For this fact alone, if Sony manages to gain an early lead on Microsoft, the logical thing will be for developers to lead on PS4 and port from there for multiplats.
 

TheKayle

Banned
Beauty != Power

A game can have beauty and not require much power, or it can require lots. You are taking something subjective (how good you believe a game looks) and trying to extrapolate from it, saying you believe there is no difference between the two because of it.

This is like me saying I like how a Volkswagen Beetle looks, therefore it's as fast as a Ferrari.

And before you start comparing Forza to Driveclub, keep in mind that Forza has a lot more static effects (level shadowing?, lighting, time of day, etc.) that Driveclub computes in real time.




In the context of memory addressing/caching, adding bits to something refers to adding new bits to the hardware that represent specific things (such as whether or not to cache data from a specific address).

By "beauty" I mean "looks better"... don't turn my words into something I would never say. I'm not talking about personal taste...

I can clearly see that the car models in Driveclub are a LOT less complex than the car models in Forza... Forza has more advanced physics compared to DC... if I'm not wrong, I've also seen crash physics in some video? But no, I don't want to compare these games... Driveclub is a new franchise; it can't be compared to GT or Forza.

Again, I'm just pointing out the fact that people here fill their mouths with words... and then the reality is just different.
 