Negotiator · Dec 1, 2015

onQ123 said:
You do know that GCN CUs could actually run an OS, right?

Then explain why Sony needs at least 1 Jaguar CPU core to run the OS? You've just contradicted yourself.

The GPU is more powerful, so according to your logic it would make sense to free all 8 Jaguar cores and reserve maybe 1 CU to run the OS. Too bad it's not that simple.

SIMD cores are specialized and they are very bad at running branchy code (e.g. AI). Fact.

truth411 · Dec 1, 2015

tapantaola said:
I don't think you know what you're talking about.

GPU CUs are akin to Cell SPUs. That's what GPGPU is for. Plenty of devs had said this, including Mark Cerny and ICE Team programmers. Why are you surprised? Cell was an early example of an "APU" (PPE -> Jaguar, SPUs -> GPU Compute Units). It heavily influenced the industry.

It's ridiculous to claim that a fully-fledged OS can run solely on the SPU. I've never accused anyone of lying, so do me a favour and stop putting words in my mouth. I just said that Sony has never elaborated on the PPE resource allocation. Yes, they've explained the SPU allocation, but that's it. You still need the PPE for more generic CPU-oriented tasks. Fact.

Calm down, read what I've said carefully (don't twist my words again) and please check the facts.

That's what Sony said, and you're saying that's false. Draw your own conclusions on what that means. In any case, if you want to disagree with facts, go ahead; we'll just agree to disagree.

Edit: Also, they did say the PPE and 6 SPEs are for game development. The OS is dedicated to the 7th SPE; they did say that.

ATOMICJORGE · Dec 1, 2015

truth411 said:
Honestly I think this will just help stabilize frame rates, don't expect miracles.

I know, but that's still great news. A variable framerate is horrible; all games should have a locked framerate. 30 or 60, it doesn't matter.

RoboPlato · Dec 1, 2015

dragonbane said:
That would be very weird if so.

Just guessing based off of the statements of devs in this thread and the sources Eurogamer has.

Negotiator · Dec 1, 2015

truth411 said:
That's what Sony said, and you're saying that's false. Draw your own conclusions on what that means. In any case, if you want to disagree with facts, go ahead; we'll just agree to disagree.

Where has Sony said that? You got a source/link?

AFAIK, they utilized the 7th SPE to accelerate specific tasks like encryption/decryption (anti-piracy measures). I bet you don't even know what branch prediction is and you're here to claim that an SPU can run an entire OS.

Show me an OS that can run solely on an SPU/GPU and explain why they still need 1 Jaguar core for the OS. I'm waiting.

inpHilltr8r · Dec 1, 2015

PS3 PPU ran the OS and game in separate soft threads. 1 SPU was locked down for security. OS also used part of another SPU for audio.

Negotiator · Dec 1, 2015

inpHilltr8r said:
PS3 PPU ran the OS and game in separate soft threads. 1 SPU was locked down for security. OS also used part of another SPU for audio.

Thank you. I think that's a sensible response. That's what I've been saying all along and some people misunderstood me.

ps: Are you a dev, if I may ask?

inpHilltr8r · Dec 1, 2015

slapnuts · Dec 1, 2015

Mr Swine said:
Laptop CPU, completely different from mobile CPU

DieH@rd said:
Just to paint the picture more completely, one Jaguar module [4 cores] draws 15W at the default 1.6GHz clock. PS4 and Xbone have two of those modules each.

lukeskymac said:
Yes, we know Jaguar is older and on an older fab process, but we're also comparing a console to a phone, people.

I can tell them that until I'm blue in the face and yet they will still want to believe their phone or tablet is more powerful. Really, some people believe this.

truth411 · Dec 1, 2015

inpHilltr8r said:
PS3 PPU ran the OS and game in separate soft threads. 1 SPU was locked down for security. OS also used part of another SPU for audio.

Weeeellllp.... Lol.

androvsky · Dec 1, 2015

tapantaola said:
Where has Sony said that? You got a source/link?

AFAIK, they utilized the 7th SPE to accelerate specific tasks like encryption/decryption (anti-piracy measures). I bet you don't even know what branch prediction is and you're here to claim that an SPU can run an entire OS.

Show me an OS that can run solely on an SPU/GPU and explain why they still need 1 Jaguar core for the OS. I'm waiting.

Branch prediction isn't the killer there; the biggest problem is that Cell SPUs can't access memory outside their local storage. All they can do is schedule a DMA transfer. They were made that way to force devs to use efficient algorithms. A Cell running an OS would be like running a business with nothing but a firehose for communication. No page tables, no interrupt handler.
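To make the local-store constraint concrete, here's a toy Python model (not real SPU code; `MainMemory`, `ToySPU`, `dma_get` and `stream_sum` are names made up for illustration). The only assumption taken from real hardware is the 256 KB local store size. The "SPU" can compute only on its own local buffer, so anything in main memory has to be pulled in chunk by chunk with an explicit transfer first:

```python
class MainMemory:
    """Stand-in for system RAM that the toy SPU cannot address directly."""
    def __init__(self, size):
        self.data = bytearray(size)


class ToySPU:
    LOCAL_STORE_SIZE = 256 * 1024  # each SPE has a 256 KB local store

    def __init__(self, main_memory):
        self._main = main_memory
        self.local = bytearray(self.LOCAL_STORE_SIZE)

    def dma_get(self, main_addr, local_addr, size):
        """Copy a block from main memory into the local store."""
        if local_addr < 0 or local_addr + size > self.LOCAL_STORE_SIZE:
            raise ValueError("transfer does not fit in local store")
        if main_addr < 0 or main_addr + size > len(self._main.data):
            raise ValueError("transfer outside main memory")
        self.local[local_addr:local_addr + size] = \
            self._main.data[main_addr:main_addr + size]

    def sum_local(self, local_addr, size):
        """Compute on local store only; main memory is never touched here."""
        return sum(self.local[local_addr:local_addr + size])


def stream_sum(spu, main_addr, total, chunk=16 * 1024):
    """Sum a large main-memory array by streaming it through the local store."""
    acc = 0
    offset = 0
    while offset < total:
        n = min(chunk, total - offset)
        spu.dma_get(main_addr + offset, 0, n)  # explicit transfer first
        acc += spu.sum_local(0, n)             # then compute locally
        offset += n
    return acc
```

Great for streaming through big arrays, awkward for the random, pointer-chasing access an OS kernel does all day.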

Negotiator · Dec 1, 2015

androvsky said:
Branch prediction isn't the killer there; the biggest problem is that Cell SPUs can't access memory outside their local storage. All they can do is schedule a DMA transfer. They were made that way to force devs to use efficient algorithms. A Cell running an OS would be like running a business with nothing but a firehose for communication. No page tables, no interrupt handler.

No disagreement here. The SPUs are optimized for SIMD tasks, just like the GPU CUs, therefore they're unsuitable for running an entire OS.

hydrophilic attack · Dec 1, 2015

This should help with framerates in CPU limited situations. Great news if true

EdReedFan20 · Dec 1, 2015

I wonder if this helps with the upcoming PS2 emulation at all.

Elandyll · Dec 1, 2015

Wtf are people talking about? Afaik the SPEs are simply not capable of running a full OS, for the simple reason that the "orchestra conductor" that assigns tasks to them is the PPE. SPEs are incapable of assigning themselves tasks, or assigning tasks to each other directly.
It's akin to saying that the cellist could take the baton and lead the entire orchestra just fine while also playing him/herself...

There's a big difference between saying that a co-processor has the "capability" of running an OS (not even the case in reality here) and whether it would make any sense...

onQ123 · Dec 1, 2015

tapantaola said:
Then explain why Sony needs at least 1 Jaguar CPU core to run the OS? You've just contradicted yourself.

The GPU is more powerful, so according to your logic it would make sense to free all 8 Jaguar cores and reserve maybe 1 CU to run the OS. Too bad it's not that simple.

SIMD cores are specialized and they are very bad at running branchy code (e.g. AI). Fact.

Just because you can doesn't mean you should. Why would they waste a CU on the OS?

GCN CUs are not just SIMD; they can also be MIMD & SMT.

Negotiator · Dec 1, 2015

Elandyll said:
Wtf are people talking about? Afaik the SPEs are simply not capable of running a full OS, for the simple reason that the "orchestra conductor" that assigns tasks to them is the PPE. SPEs are incapable of assigning themselves tasks, or assigning tasks to each other directly.
It's akin to saying that the cellist could take the baton and lead the entire orchestra just fine while also playing him/herself...

There's a big difference between saying that a co-processor has the "capability" of running an OS (not even the case in reality here) and whether it would make any sense...

Exactly. That's why it doesn't make any sense to claim that the 7th SPU can run the entire PS3 OS by itself.

In layman's terms, SPUs are like factory workers. They're numerous, they do one specific job and they're pretty good at it. They don't run the business though and that's why they need the "factory manager" (PPU) to "orchestrate" them.

onQ123 said:
Just because you can doesn't mean you should. Why would they waste a CU on the OS?

GCN CUs are not just SIMD; they can also be MIMD & SMT.

I'm still waiting for you to show me that GCN CU-powered OS (assuming it doesn't rely on traditional CPUs at all):

http://www.neogaf.com/forum/showpost.php?p=187364673&postcount=648

Don't you think it's dumb to buy both a CPU and a GPU? If a CPU is not really needed, then what's the point of spending extra money? PC gaming would be cheaper as well if that was the case.

This discussion reminds me of the "GPUs will replace CPUs" nonsense. You guys got a dev response, so I don't know why you keep insisting.

If you're wrong, just admit it and move on. I was wrong about the Vita expanded RAM allocation (77MB vs 109MB), because reddit misinformed me. I didn't keep insisting on it.

onQ123 · Dec 1, 2015

tapantaola said:
Exactly. That's why it doesn't make any sense to claim that the 7th SPU can run the entire PS3 OS by itself.

In layman's terms, SPUs are like factory workers. They're numerous, they do one specific job and they're pretty good at it. They don't run the business though and that's why they need the "factory manager" (PPU) to "orchestrate" them.

I'm still waiting for you to show me that GCN CU-powered OS (assuming that it doesn't rely on traditional CPUs at all):

http://www.neogaf.com/forum/showpost.php?p=187364673&postcount=648

Don't you think it's dumb to buy both a CPU and a GPU? If a CPU is not really needed, then what's the point of spending extra money? PC gaming would be cheaper as well if that was the case.

This discussion reminds me of the "GPUs will replace CPUs" nonsense. You guys got a dev response, so I don't know why you keep insisting.

If you're wrong, just admit it and move on. I was wrong about the Vita expanded RAM allocation (77MB vs 109MB), because reddit misinformed me. I didn't keep insisting on it.

No dev responded to me about GCN CUs being able to run an OS.

AMD's Heterogeneous Queuing (hQ) technology promises to unlock the true potential of APUs, allowing a GPU to queue its own work without the CPU or operating system getting in the way.

The GPU in the PS4 could run an OS if it had to, but why would they run the OS on the GPU?

serversurfer · Dec 1, 2015

Blanquito said:
Hey, glad to see I'm not totally off base for once.

*bows*

biglittleps said:
Eurogamer did not guess; they have a source who confirmed that it's shared between OS and game.

That's not what was said. Firelight, makers of FMOD, told Eurogamer that the core had been unlocked for developers, that they had no idea what the OS reservation was, if any, and that Razor was able to report both user and OS activity on Core6. They never said it was doing so; just that it could. Then Firelight independently told GamingBolt that the OS reservation was not dynamic, and GamingBolt then assumed the reservation was fixed at some non-zero value. Zoetis then confirmed that the OS reservation is indeed fixed, at 0%. Also, only first-party currently have access.

So that would seem to explain all of the confusion that EG and GB are getting. Basically, Firelight don't know squat. They're not first-party; they're a third-party developer of middleware used on the PS4. As such, they don't actually have direct access to the super-secret SDK and Core6. Basically, Sony just told them to enable access to Core6 from within FMOD, so first-party could continue using the middleware in their app. Firelight did so, and that was pretty much the end of their involvement with the unlocking. Firelight weren't actually involved with the unlocking itself; they just tweaked their app to leverage it, but they can't even see how it works yet, because they can't access it themselves yet, being third-party.

androvsky · Dec 1, 2015

onQ123 said:
No dev responded to me about GCN CUs being able to run an OS.

AMD's Heterogeneous Queuing (hQ) technology promises to unlock the true potential of APUs, allowing a GPU to queue its own work without the CPU or operating system getting in the way.

The GPU in the PS4 could run an OS if it had to, but why would they run the OS on the GPU?

Not needing an OS to hand off function pointers is different from actually running an OS. Can a GCN CU take system level interrupts? Can it talk to peripherals without going through the CPU? Can it take over system init from the UEFI?

inpHilltr8r · Dec 1, 2015

tapantaola said:
Exactly. That's why it doesn't make any sense to claim that the 7th SPU can run the entire PS3 OS by itself.

You could totally run an entire OS on an SPU though. Use DMA to page memory. It's possible the IO is part of the PPU, but you could handle that on a teeny tiny interrupt.

Be some serious old school voodoo shit though.

onQ123 · Dec 1, 2015

androvsky said:
Not needing an OS to hand off function pointers is different from actually running an OS. Can a GCN CU take system level interrupts? Can it talk to peripherals without going through the CPU? Can it take over system init from the UEFI?

If someone wrote that code for the GPGPU, yes.

StevieP · Dec 1, 2015

onQ123 said:
If someone wrote that code for the GPGPU, yes.

You do realize the CPU handles all of the talking, right?

DonMigs85 · Dec 1, 2015

Lol at some of these armchair hardware and programming experts here. GPUs are good with highly parallelizable code, but not branchy code.
Info below taken from http://superuser.com/questions/308771/why-are-we-still-using-cpus-instead-of-gpus

GPGPU is still a relatively new concept. GPUs were initially used for rendering graphics only; as technology advanced, the large number of cores in GPUs relative to CPUs was exploited by developing computational capabilities for GPUs so that they can process many parallel streams of data simultaneously, no matter what that data may be. While GPUs can have hundreds or even thousands of stream processors, they each run slower than a CPU core and have fewer features (even if they are Turing complete and can be programmed to run any program a CPU can run). Features missing from GPUs include interrupts and virtual memory, which are required to implement a modern operating system.

In other words, CPUs and GPUs have significantly different architectures that make them better suited to different tasks. A GPU can handle large amounts of data in many streams, performing relatively simple operations on them, but is ill-suited to heavy or complex processing on a single or few streams of data. A CPU is much faster on a per-core basis (in terms of instructions per second) and can perform complex operations on a single or few streams of data more easily, but cannot efficiently handle many streams simultaneously.

As a result, GPUs are not suited to handle tasks that do not significantly benefit from or cannot be parallelized, including many common consumer applications such as word processors. Furthermore, GPUs use a fundamentally different architecture; one would have to program an application specifically for a GPU for it to work, and significantly different techniques are required to program GPUs. These different techniques include new programming languages, modifications to existing languages, and new programming paradigms that are better suited to expressing a computation as a parallel operation to be performed by many stream processors. For more information on the techniques needed to program GPUs, see the Wikipedia articles on stream processing and parallel computing.

Modern GPUs are capable of performing vector operations and floating-point arithmetic, with the latest cards capable of manipulating double-precision floating-point numbers. Frameworks such as CUDA and OpenCL enable programs to be written for GPUs, and the nature of GPUs make them most suited to highly parallelizable operations, such as in scientific computing, where a series of specialized GPU compute cards can be a viable replacement for a small compute cluster as in NVIDIA Tesla Personal Supercomputers. Consumers with modern GPUs who are experienced with Folding@home can use them to contribute with GPU clients, which can perform protein folding simulations at very high speeds and contribute more work to the project (be sure to read the FAQs first, especially those related to GPUs). GPUs can also enable better physics simulation in video games using PhysX, accelerate video encoding and decoding, and perform other compute-intensive tasks. It is these types of tasks that GPUs are most suited to performing.

AMD is pioneering a processor design called the Accelerated Processing Unit (APU) which combines conventional x86 CPU cores with GPUs. This approach enables graphical performance vastly superior to motherboard-integrated graphics solutions (though no match for more expensive discrete GPUs), and allows for a compact, low-cost system with good multimedia performance without the need for a separate GPU. The latest Intel processors also offer on-chip integrated graphics, although competitive integrated GPU performance is currently limited to the few chips with Intel Iris Pro Graphics. As technology continues to advance, we will see an increasing degree of convergence of these once-separate parts. AMD envisions a future where the CPU and GPU are one, capable of seamlessly working together on the same task.

Nonetheless, many tasks performed by PC operating systems and applications are still better suited to CPUs, and much work is needed to accelerate a program using a GPU. Since so much existing software uses the x86 architecture, and because GPUs require different programming techniques and are missing several important features needed for operating systems, a general transition from CPU to GPU for everyday computing is very difficult.
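For anyone wondering why "branchy" hurts on a GPU specifically, here's a rough cost model in Python. It is a deliberate simplification, not a simulator: it assumes a group of lanes runs in lockstep (64 wide, which is the GCN wavefront size), and that when lanes in a group disagree on a branch, the group has to step through both sides one after the other. The function name and the cycle costs are made up for illustration.

```python
def lockstep_cycles(flags, cost_then, cost_else, width=64):
    """Cycles for lanes that execute in lockstep groups of `width`.

    flags[i] is True if lane i takes the 'then' side of the branch.
    A group pays for a side if at least one of its lanes needs it.
    """
    total = 0
    for start in range(0, len(flags), width):
        group = flags[start:start + width]
        if any(group):       # at least one lane takes the 'then' side
            total += cost_then
        if not all(group):   # at least one lane takes the 'else' side
            total += cost_else
    return total


# Every lane agrees: the group only pays for one side.
coherent = lockstep_cycles([True] * 64, cost_then=10, cost_else=30)

# Lanes alternate: the same group pays for both sides.
divergent = lockstep_cycles([i % 2 == 0 for i in range(64)],
                            cost_then=10, cost_else=30)
```

Under these assumptions the coherent group costs 10 cycles and the divergent one costs 40, even though both did the same amount of "useful" work per lane on average or less.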

The_Dama · Dec 1, 2015

When Sony increased the CPU on the PSP to 333MHz or something like that for GoW, did it help other games too? Games that were originally programmed for 222MHz?

onQ123 · Dec 1, 2015

StevieP said:
You do realize the CPU handles all of the talking, right?

"AMD's latest revolutionary processing architecture, Heterogeneous System Architecture (HSA), bridges the gap between CPU and GPU cores and delivers a new innovation called compute cores.
This groundbreaking technology allows CPU and GPU cores to speak the same language and share workloads and the same memory. With HSA, CPU and GPU cores are designed to work together in a single accelerated processing unit (APU), creating a faster, more efficient and seamless way to accelerate applications while delivering great performance and rich entertainment.
Learn more about the benefits of HSA and compute cores"

DonMigs85 · Dec 1, 2015

onQ123 said:
"AMD's latest revolutionary processing architecture, Heterogeneous System Architecture (HSA), bridges the gap between CPU and GPU cores and delivers a new innovation called compute cores.
This groundbreaking technology allows CPU and GPU cores to speak the same language and share workloads and the same memory. With HSA, CPU and GPU cores are designed to work together in a single accelerated processing unit (APU), creating a faster, more efficient and seamless way to accelerate applications while delivering great performance and rich entertainment.
Learn more about the benefits of HSA and compute cores"

The GPU can't do everything by itself. The x86 cores are still essential. That's the point you don't seem to get. There aren't even any purely CUDA or OpenCL operating systems at all.

lowhighkang_LHK · Dec 1, 2015

How much of an improvement is this in cpu usage? 10 percent?

Also, so it's reconfirmed that the entire 6th core is available for game processes.

serversurfer · Dec 1, 2015

onQ123 said:
AMD's Heterogeneous Queuing (hQ) technology promises to unlock the true potential of APUs, allowing a GPU to queue its own work without the CPU or operating system getting in the way.

That doesn't mean it can run an OS, nor does it mean it doesn't need an OS at all.

I dunno if you're familiar, but I've talked in the past about how hUMA and GPGPU can be used to boost performance on stuff like AI. Most AI code is branchy prediction stuff that runs really well on the CPU, but you also need to determine what the actor can see around them (visibility), and also how they plan to move from where they are to where they want to be (pathfinding). So you look around, and based on what you see, you decide where you'd like to be instead of where you are, and then you figure out how you're going to get there. (Ladder? Stairs?)

The problem is, the CPU is really really shitty at visibility and pathfinding. In fact, even though it makes up a fairly small portion of the total AI code, a CPU will spend about 90% of its cycles just on visibility and pathfinding. The other 10% of your CPU time is what makes the actual decisions.

Enter the GPU, which happens to be super awesome at both visibility and pathfinding. So we ask the GPU what the actor can see, and then the CPU can make a decision on what the GPU spots. Not only do we figure out what we see much faster overall, we've freed up 90% of the AI core. Now we can make decisions for ten times as many actors, for example. Or we can teach them to make far more complicated decisions.

Now let's add heterogeneous queuing to the mix. Without it, the CPU basically needs to say, "What can actor1 see? Okay, then how does he get to where he's headed now? Okay, how about actor2?" That wastes a lot of the CPU's precious time, and is unnecessary on GCN. Basically, the GPU already knows it needs to rapidly update visibility and pathfinding for every entry in the actors table. Thanks to hUMA, the very same table is shared between the GPU and CPU, so they don't need to pass that back and forth either; either can directly access any data they need. That means the GPU just takes the first actor, casts rays out of its face, determines which object(s) of interest those rays intersect, sets the resulting array in the actor's canSee property, and moves on to the next actor. After that, the GPU can look up each actor's currentLocation and desiredLocation, and write out a nice path for them to start following in the actors table. Hell, it can probably do the visibility and pathfinding simultaneously, actually, since they're not directly interdependent.

All of this occurs with no input from the CPU whatsoever. "But wait! All of the input is coming from the CPU!" Sure, but the GPU doesn't care nor even know if it's looking up data the CPU wrote moments ago, or reading from a script written a year ago. It's just reading two points in space and tracing a line between them through the environment, which it also has a record of. All the GPU knows is input comes from A; output goes in B and it just does that, again, and again, and again because that's what it does.

Meanwhile, from the CPU's perspective, everything is equally simple, but reversed. Input comes from B; output goes in A.
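Here's roughly what I mean, as a Python sketch. To be clear about what's assumed: both "processors" are just plain functions here, the sight check is a dumb distance test standing in for real ray casting, and the path is a straight walk standing in for real pathfinding. The field names follow the canSee / currentLocation / desiredLocation idea from above; everything else (`Actor`, `gpu_pass`, `cpu_pass`, the "medkit" object) is made up for illustration. The point is only that both sides read and write the same table and never hand data to each other directly.

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    name: str
    current_location: tuple
    desired_location: tuple
    can_see: list = field(default_factory=list)  # written by the "GPU" pass
    path: list = field(default_factory=list)     # written by the "GPU" pass


def straight_path(start, goal):
    """Placeholder pathfinder: step one grid cell at a time toward the goal."""
    x, y = start
    gx, gy = goal
    path = []
    while (x, y) != (gx, gy):
        x += (gx > x) - (gx < x)
        y += (gy > y) - (gy < y)
        path.append((x, y))
    return path


def gpu_pass(actors, objects, sight_range=5):
    """Input: locations. Output: can_see and path, for every actor."""
    for a in actors:
        ax, ay = a.current_location
        a.can_see = [name for name, (ox, oy) in objects.items()
                     if abs(ox - ax) + abs(oy - ay) <= sight_range]
        a.path = straight_path(a.current_location, a.desired_location)


def cpu_pass(actors, objects):
    """Input: can_see. Output: desired_location (the actual decision)."""
    for a in actors:
        if "medkit" in a.can_see:
            a.desired_location = objects["medkit"]


objects = {"medkit": (2, 1), "door": (9, 9)}
actors = [Actor("guard", (0, 0), (0, 0))]

gpu_pass(actors, objects)   # guard spots the medkit, has nowhere to go yet
cpu_pass(actors, objects)   # guard decides to go get it
gpu_pass(actors, objects)   # next pass writes out the path to it
```

Neither function calls the other or waits on the other; each one just loops over the shared table.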

So, yeah, that's what hQ is all about. It doesn't change what the GPU is capable of; it just means the GPU already knows what it's supposed to be doing with the data, basically. Not only does that free up time on the CPU, it reduces interdependency, which helps you make your engine not just asynchronous but also largely autonomous. Ideally, you want each task to be completely independent from the others. That lets you do all kinds of neat tricks. For example, obviously, you want rendering done at 120 Hz in VR, but does your AI really need to make 120 decisions a second? I don't know that humans change their mind that quickly. Running your AI at 30 Hz or even 15 Hz would save you a lot of cycles, and seems like it would provide sufficient realism, especially if you're running more sophisticated scripts now, with all the time you saved offloading visibility and pathfinding.
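The rate decoupling is easy to picture with a toy loop in Python. This is a sketch only: `simulate` is a made-up name, nothing is actually rendered or decided, and it assumes the render rate is a whole multiple of the AI rate so the AI step can simply run every Nth frame.

```python
def simulate(seconds, render_hz=120, ai_hz=30):
    """Count rendered frames and AI decisions over a stretch of time."""
    if render_hz % ai_hz != 0:
        raise ValueError("sketch assumes render_hz is a multiple of ai_hz")
    frames_per_decision = render_hz // ai_hz
    frames = 0
    decisions = 0
    for tick in range(seconds * render_hz):
        if tick % frames_per_decision == 0:
            decisions += 1   # AI step runs only on every Nth frame
        frames += 1          # rendering runs on every frame
    return frames, decisions


at_30 = simulate(1)             # 120 Hz render, 30 Hz AI
at_15 = simulate(1, ai_hz=15)   # 120 Hz render, 15 Hz AI
```

So one second at 120 Hz with 30 Hz AI is 120 frames but only 30 decisions, and dropping the AI to 15 Hz halves that again.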

onQ123 · Dec 1, 2015

DonMigs85 said:
Lol at some of these armchair hardware and programming experts here. GPUs are good with highly parallelizable code, but not branchy code.
Info below taken from http://superuser.com/questions/308771/why-are-we-still-using-cpus-instead-of-gpus

GPGPU is still a relatively new concept. GPUs were initially used for rendering graphics only; as technology advanced, the large number of cores in GPUs relative to CPUs was exploited by developing computational capabilities for GPUs so that they can process many parallel streams of data simultaneously, no matter what that data may be. While GPUs can have hundreds or even thousands of stream processors, they each run slower than a CPU core and have fewer features (even if they are Turing complete and can be programmed to run any program a CPU can run). Features missing from GPUs include interrupts and virtual memory, which are required to implement a modern operating system.

In other words, CPUs and GPUs have significantly different architectures that make them better suited to different tasks. A GPU can handle large amounts of data in many streams, performing relatively simple operations on them, but is ill-suited to heavy or complex processing on a single or few streams of data. A CPU is much faster on a per-core basis (in terms of instructions per second) and can perform complex operations on a single or few streams of data more easily, but cannot efficiently handle many streams simultaneously.

As a result, GPUs are not suited to handle tasks that do not significantly benefit from or cannot be parallelized, including many common consumer applications such as word processors. Furthermore, GPUs use a fundamentally different architecture; one would have to program an application specifically for a GPU for it to work, and significantly different techniques are required to program GPUs. These different techniques include new programming languages, modifications to existing languages, and new programming paradigms that are better suited to expressing a computation as a parallel operation to be performed by many stream processors. For more information on the techniques needed to program GPUs, see the Wikipedia articles on stream processing and parallel computing.

Modern GPUs are capable of performing vector operations and floating-point arithmetic, with the latest cards capable of manipulating double-precision floating-point numbers. Frameworks such as CUDA and OpenCL enable programs to be written for GPUs, and the nature of GPUs make them most suited to highly parallelizable operations, such as in scientific computing, where a series of specialized GPU compute cards can be a viable replacement for a small compute cluster as in NVIDIA Tesla Personal Supercomputers. Consumers with modern GPUs who are experienced with Folding@home can use them to contribute with GPU clients, which can perform protein folding simulations at very high speeds and contribute more work to the project (be sure to read the FAQs first, especially those related to GPUs). GPUs can also enable better physics simulation in video games using PhysX, accelerate video encoding and decoding, and perform other compute-intensive tasks. It is these types of tasks that GPUs are most suited to performing.

AMD is pioneering a processor design called the Accelerated Processing Unit (APU) which combines conventional x86 CPU cores with GPUs. This approach enables graphical performance vastly superior to motherboard-integrated graphics solutions (though no match for more expensive discrete GPUs), and allows for a compact, low-cost system with good multimedia performance without the need for a separate GPU. The latest Intel processors also offer on-chip integrated graphics, although competitive integrated GPU performance is currently limited to the few chips with Intel Iris Pro Graphics. As technology continues to advance, we will see an increasing degree of convergence of these once-separate parts. AMD envisions a future where the CPU and GPU are one, capable of seamlessly working together on the same task.

Nonetheless, many tasks performed by PC operating systems and applications are still better suited to CPUs, and much work is needed to accelerate a program using a GPU. Since so much existing software uses the x86 architecture, and because GPUs require different programming techniques and are missing several important features needed for operating systems, a general transition from CPU to GPU for everyday computing is very difficult.

onQ123 · Dec 1, 2015

DonMigs85 said:
The GPU can't do everything by itself. The x86 cores are still essential. That's the point you don't seem to get. There aren't even any purely CUDA or OpenCL operating systems at all.

I know that CPUs are better at most OS tasks, but my point was that it could be done, not that it should be done. At the end of the day both are GP processors.

serversurfer said:
That doesn't mean it can run an OS, nor does it mean it doesn't need an OS at all.

I dunno if you're familiar, but I've talked in the past about how hUMA and GPGPU can be used to boost performance on stuff like AI. Most AI code is branchy prediction stuff that runs really well on the CPU, but you also need to determine what the actor can see around them (visibility), and also how they plan to move from where they are to where they want to be (pathfinding). So you look around, and based on what you see, you decide where you'd like to be instead of where you are, and then you figure out how you're going to get there. (Ladder? Stairs?)

The problem is, the CPU is really really shitty at visibility and pathfinding. In fact, even though it makes up a fairly small portion of the total AI code, a CPU will spend about 90% of its cycles just on visibility and pathfinding. The other 10% of your CPU time is what makes the actual decisions.

Enter the GPU, which happens to be super awesome at both visibility and pathfinding. So we ask the GPU what the actor can see, and then the CPU can make a decision on what the GPU spots. Not only do we figure out what we see much faster overall, we've freed up 90% of the AI core. Now we can make decisions for ten times as many actors, for example. Or we can teach them to make far more complicated decisions.

Now let's add heterogeneous queuing to the mix. Without it, the CPU basically needs to say, "What can actor1 see? Okay, then how does he get to where he's headed now? Okay, how about actor2?" That wastes a lot of the CPU's precious time, and is unnecessary on GCN. Basically, the GPU already knows it needs to rapidly update visibility and pathfinding for every entry in the actors table. Thanks to hUMA, the very same table is shared between the GPU and CPU, so they don't need to pass that back and forth either; either can directly access any data they need. That means the GPU just takes the first actor, casts rays out of its face, determines which object(s) of interest those rays intersect, sets the resulting array in the actor's canSee property, and moves on to the next actor. After that, the GPU can look up each actor's currentLocation and desiredLocation, and write out a nice path for them to start following in the actors table. Hell, it can probably do the visibility and pathfinding simultaneously, actually, since they're not directly interdependent.

All of this occurs with no input from the CPU whatsoever. "But wait! All of the input is coming from the CPU!" Sure, but the GPU doesn't care nor even know if it's looking up data the CPU wrote moments ago, or reading from a script written a year ago. It's just reading two points in space and tracing a line between them through the environment, which it also has a record of. All the GPU knows is input comes from A; output goes in B and it just does that, again, and again, and again because that's what it does.

Meanwhile, from the CPU's perspective, everything is equally simple, but reversed. Input comes from B; output goes in A.

So, yeah, that's what hQ is all about. It doesn't change what the GPU is capable of; it just means the GPU already knows what it's supposed to be doing with the data, basically. Not only does that free up time on the CPU, it reduces interdependency, which helps you make your engine not just asynchronous but also largely autonomous. Ideally, you want each task to be completely independent from the others. That lets you do all kinds of neat tricks. For example, obviously, you want rendering done at 120 Hz in VR, but does your AI really need to make 120 decisions a second? I don't know that humans change their mind that quickly. Running your AI at 30 Hz or even 15 Hz would save you a lot of cycles, and seems like it would provide sufficient realism, especially if you're running more sophisticated scripts now, with all the time you saved offloading visibility and pathfinding.

What would keep the GPGPU from running the OS code if the instructions were written for it?

c0de · Dec 1, 2015

onQ123 said:
You do know that GCN CU's could actually run a OS right?

What? How? The GPU would still have to be instructed by the CPU. That's how it currently works. The GPU is there to help CPUs, not replace them. GPUs can execute a lot of things faster, but they don't live "on their own".

DieH@rd · Dec 1, 2015

The_Dama said:
When Sony increased the CPU clock on the PSP to 333MHz or something like that for GoW, did it help other games too? Games that were originally programmed for 222MHz?

It helped games that were run on CFW PSPs. There, users had the ability to force CPU clocks no matter what game was active.

onQ123 · Dec 1, 2015

c0de said:
What? How? The GPU would still have to be instructed by the CPU. That's how it currently works. The GPU is there to help CPUs, not replace them. GPUs can execute a lot of things faster, but they don't live "on their own".

The hQ paradigm, by contrast, upgrades the GPU from a mere adjunct to the CPU to a processor of equal status. In this design, a given application can generate task queues directly on the GPU without the CPU getting involved. Better still, the GPU can generate its own workload - AMD's example here is raytracing, where one GPU task may generate several more tasks in its execution and which would previously have needed the CPU and operating system to be involved in queuing said new tasks - and even push work into the CPU's task queue. Equally, the CPU can push work into the GPU task queue without operating system involvement - creating a bi-directional queueing system which dramatically reduces latency and allows applications to easily push jobs to whichever processor is most appropriate.
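A toy model of that bi-directional queueing, in plain Python. This is not the HSA runtime or any AMD API; `run`, `gpu_handler`, `cpu_handler` and the task tuples are all hypothetical, and both "processors" are just loops on one thread. It only shows the shape of the idea: a task can enqueue follow-up tasks on its own queue or on the other processor's queue, with no central dispatcher in between.

```python
from collections import deque


def gpu_handler(task, queues):
    kind, depth = task
    if kind == "ray" and depth > 0:
        # A ray hit spawns two secondary rays, queued by the "GPU" itself.
        queues["gpu"].append(("ray", depth - 1))
        queues["gpu"].append(("ray", depth - 1))
    elif kind == "ray":
        # A leaf ray pushes its result straight into the "CPU" queue.
        queues["cpu"].append(("shade_result", 0))


def cpu_handler(task, queues):
    # The "CPU" just consumes results here; it could equally push GPU work.
    pass


def run(queues, handlers):
    """Drain both queues; any handler may append new tasks to either queue."""
    log = []
    while any(queues.values()):
        for proc in ("gpu", "cpu"):
            if queues[proc]:
                task = queues[proc].popleft()
                log.append((proc, task))
                handlers[proc](task, queues)
    return log


queues = {"gpu": deque([("ray", 2)]), "cpu": deque()}
log = run(queues, {"gpu": gpu_handler, "cpu": cpu_handler})
gpu_tasks = sum(1 for proc, _ in log if proc == "gpu")   # 1 + 2 + 4 rays
cpu_tasks = sum(1 for proc, _ in log if proc == "cpu")   # 4 leaf results
```

One seed task fans out into seven GPU tasks and four CPU tasks without anything outside the handlers deciding what gets queued next.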

c0de · Dec 1, 2015

onQ123 said:
The hQ paradigm, by contrast, upgrades the GPU from a mere adjunct to the CPU to a processor of equal status. In this design, a given application can generate task queues directly on the GPU without the CPU getting involved. Better still, the GPU can generate its own workload - AMD's example here is raytracing, where one GPU task may generate several more tasks in its execution and which would previously have needed the CPU and operating system to be involved in queuing said new tasks - and even push work into the CPU's task queue. Equally, the CPU can push work into the GPU task queue without operating system involvement - creating a bi-directional queueing system which dramatically reduces latency and allows applications to easily push jobs to whichever processor is most appropriate.

And in which "environment" (OS) do you want to "launch" the binary for the GPU? That text assumes an existing environment (OS) is already running, which is not what you mean when you say the OS could run on the GPU. It describes hardware capabilities, not requirements for the OS.

Jenotron · Dec 1, 2015

I'd hate to see the PS4's OS slow down because of this redistribution. I bought an Xbox One on Friday and even though my expectations were super low I'm still disappointed in how sluggish it operates. It's like trying to do anything on a really crappy phone running Android.

DieH@rd · Dec 1, 2015

Jenotron said:
I'd hate to see the PS4's OS slow down because of this redistribution. I bought an Xbox One on Friday and even though my expectations were super low I'm still disappointed in how sluggish it operates. It's like trying to do anything on a really crappy phone running Android.

And have you ever tried using the PS4 OS?

driver116 · Dec 1, 2015

I'm guessing the OS switches to a blocking single core mode when in game and back to dual core when the user goes back to the OS. Makes sense to free the extra core up when out of the GUI.

lherre · Dec 1, 2015

Ps4 7th core

2 modes: 10% or 50% depending on the mode used

DieH@rd · Dec 1, 2015

lherre said:
Ps4 7th core

2 modes: 10% or 50% depending on the mode used

I wonder what OS service is not used when dev takes 50% of that core for gaming.

ninjablade · Dec 1, 2015

lherre said:
Ps4 7th core

2 modes: 10% or 50% depending on the mode used

Similar to Xbone? Thanks for the info.

Paganmoon · Dec 1, 2015

So the other vetted insider from earlier (zoiost?) was wrong when he said devs get full access to the core?

Lord Error · Dec 1, 2015

lherre said:
Ps4 7th core

2 modes: 10% or 50% depending on the mode used

10% for OS or for the game?

d9b · Dec 1, 2015

Could it be that some of the first party games are already utilising this advantage (Uncharted Collection, last patch)?

driver116 · Dec 1, 2015

DieH@rd said:
I wonder what OS service is not used when dev takes 50% of that core for gaming.

PS Camera?
Streaming?
Background Music?

Negotiator · Dec 1, 2015

onQ123 said:
No dev responded to me about GCN CU's being able to run an OS.

AMD's Heterogeneous Queuing (hQ) technology promises to unlock the true potential of APUs, allowing a GPU to queue its own work without the CPU or operating system getting in the way.

The GPU in the PS4 could run an OS if it had to, but why would they run the OS on the GPU?

I'm afraid you're confusing Asynchronous Compute with running a full-blown OS.

A modern x86-64 processor has tons of features that a GPU doesn't have and it supports over 1000 different instructions. A GPU/SPU is specialized for certain tasks (mainly linear algebra/matrix multiplication). Its feature/instruction set is limited compared to a CPU's, and that's how it can reach multiple teraflops of throughput. It's a different design philosophy, depending on what you want to do with a limited transistor budget. A CPU is a jack of all trades and master of none, while the GPU/SPU is a specialized streaming processor.

Kayant · Dec 1, 2015

Paganmoon said:
So the other vetted insider from earlier (zoiost?) was wrong when he said devs get full access to the core?

Maybe. There are two things here, I think:

1. FMOD, Eurogamer's source (which might be FMOD) and lherre (thanks for the info as always) all point to it not being completely unlocked. Although they are all third party devs.

2. Zoetis says it's completely unlocked for first party devs atm and says this has been the case for about 2 months, which lines up with what lherre said (2-3 months).

So hard to say atm.

nortonff · Dec 1, 2015

I bet the OS could run a lot faster if we could hide/delete/organize everything we wanted.
Just folders for what we want right away and library for everything else.

Negotiator · Dec 1, 2015

onQ123 said:

Shaders have nothing to do with traditional x86/CPU code.

Shaders are made to process graphics, physics and parallelizable stuff in general.

lherre said:
Ps4 7th core

2 modes: 10% or 50% depending on the mode used

Well, that's a bummer compared to Xbone's 50-80% allocation.

serversurfer · Dec 1, 2015

onQ123 said:
What would keep the GPGPU from running the OS code if the instructions were written for it?

Well, apart from the fact that it's just not very good at it? Potentially, missing circuitry. I don't know enough to say for sure, but I wouldn't be at all surprised if the GPU simply lacked the transistors required to perform vital operations, because why would you waste silicon on functionality nobody will ever call? So maybe it can run the entire OS and maybe it can't, but what difference does it really make if it's a terrible idea to begin with?

That's what I was trying to get across in my hQ explanation, actually. It's dumb to force your decision-maker to waste all of its time crunching numbers, but it's equally stupid to make your number-cruncher try to make decisions. That's the advantage of GCN. Everything is designed around the idea of letting the appropriate chip handle the appropriate function, and letting the two chips work as independently from one another as possible. Running the entire OS on the GPU is just as dumb as running the entire OS on the CPU.

Having information like visibility magically update for the CPU is a huge fucking win, so we should be talking about stuff like that, or what devs are gonna run on this extra CPU core, instead of bickering about moot points like running the OS entirely on the GPU. Yes?

Zoetis said:
full core for gaming

lherre said:
2 modes: 10% or 50% depending on the mode used

Fight!!


Recent PS4 SDK update unlocked 7th CPU core for gaming
