
Sony Patents possible PS5 Backward Compatibility method invented by Mark Cerny

LordOfChaos

Member
My favorite thing about Cell is that 15 years after its release and 6 years after its relevance ended, it can still create a flash of debate on command, lol.
 

onQ123

Member
Good to see this. So MS are working on this. Does this mean this is strictly their tech? So I’m hoping this is implemented in their next Xbox (and also hoping Sony have something similar for their PS5).

I wonder how demanding this is on the GPU.


Nvidia tech, but they used DirectML & demo'ed it using Forza. I expect even more tech like this to find its way into next-gen consoles & even the consoles on the market now.
 

TLZ

Banned
Nvidia tech, but they used DirectML & demo'ed it using Forza. I expect even more tech like this to find its way into next-gen consoles & even the consoles on the market now.
I truly hope so because this looks like the perfect solution for BC.
 

bitbydeath

Member
It's a Broadband Engine, and it's even stamped as such on the PS3 chip. Cell isn't used solely as a CPU and can be found in electronics from vendors besides Sony.
I think that's what he meant.

Yeah, this is true, as Sony was originally going to make a Cell-powered GPU as well.

Cell was quite amazing in what it could accomplish. It makes me wonder if Sony's collaboration with AMD is to customise Navi with their learnings from Cell.

The results could be quite incredible if they went down that path.
 

LordOfChaos

Member
Yeah, this is true, as Sony was originally going to make a Cell-powered GPU as well.

Cell was quite amazing in what it could accomplish. It makes me wonder if Sony's collaboration with AMD is to customise Navi with their learnings from Cell.

The results could be quite incredible if they went down that path.



Allegedly, the 8 ACEs they pulled into the PS4 from a bit further down the AMD pipeline at the time were to help with some of what they were doing with SPE jobs. As you mentioned, SIMD was one area where Jaguar wasn't always a step up from Cell, but GPUs do it better than either now.

I think any such customization would already be table stakes for AMD GPUs by now though, so I'm curious what other customization papa Cerny will be asking for. Can't think what else they would want to pull from the Cell at this point; GPGPU has largely superseded what it did well compared to traditional CPUs.
 
Last edited:

GermanZepp

Member
It would be awesome if, at the PS5 announce, they go like... "Yeah, legacy backwards compatibility: PSone, two, three and four, $399" and drop the mic.
 

Ten_Fold

Member
If Sony announce at E3 next year that the PS5 can play PS1-4 games, I'm 100 percent buying day one, easily. I hate hooking up old systems anyway, lol.
 

TGO

Hype Train conductor. Works harder than it steams.
I hope this makes its way to PS5. With the amount of Remasters and Remakes this gen, it'll be nice to carry them over to next gen instead of going back to square one of only being able to play them on legacy hardware.
It also adds to the whole library catalogue of a new console.
 
Last edited:

dogen

Member
Could you say more about the PS2's PS1 emulator? The PS2's CPU has always been compatible with the PS1.

No, the PS2 included a PS1 CPU. It was only replaced later on, in the slim model (with a PowerPC CPU), which ran an emulator. The PS2 GS was essentially a PS1 GPU on steroids, and it most likely didn't take much work to convert instructions.
 
What do you mean Cell isn’t a CPU? Of course it is. The below article highlights the differences found.

ubisoft-cloth-simulation-ps4-vs-ps3.jpg


http://www.redgamingtech.com/ubisoft-gdc-presentation-of-ps4-x1-gpu-cpu-performance/
I'm sorry buddy, but you're badly misinformed (and you're not the only one).

As I said, Cell isn't even a CPU to begin with: http://vr-zone.com/articles/sony-cell-processor-really-tough-get-grips/63732.html

"The Cell processor was the precursor to today’s accelerated processing units, very much a heterogeneous architecture"

Comparing Jaguar (which is a CPU) to Cell SPEs is disingenuous to say the least... wanna compare it to PPE (which is actually a CPU) and tell us how it performs? :)

This is a fair comparison: http://media.redgamingtech.com/rgt-...s-xbox-one-vs-xbox-360-vs-ps3-cpu-and-gpu.jpg

My favorite thing about Cell is that 15 years after its release and 6 years after its relevance ended, it can still create a flash of debate on command, lol.
Oh, I remember people saying back in 2005 that it was the first "true" 8-core CPU. Talk about confusing regular CPU cores with dedicated SIMD engines.

AMD must be really dumb that it took them over a decade later to deliver a true 8-core CPU design with Zen, right? Even the FX series was a quad-core design with shared FPU.

As I said, most people are misinformed (and it's not entirely their fault)...
 
Last edited:

DeepEnigma

Gold Member
I'm sorry buddy, but you're badly misinformed (and you're not the only one).

As I said, Cell isn't even a CPU to begin with: http://vr-zone.com/articles/sony-cell-processor-really-tough-get-grips/63732.html

"The Cell processor was the precursor to today’s accelerated processing units, very much a heterogeneous architecture"

Comparing Jaguar (which is a CPU) to Cell SPEs is disingenuous to say the least... wanna compare it to PPE (which is actually a CPU) and tell us how it performs? :)

This is a fair comparison: http://media.redgamingtech.com/rgt-...s-xbox-one-vs-xbox-360-vs-ps3-cpu-and-gpu.jpg


Oh, I remember people saying back in 2005 that it was the first "true" 8-core CPU. Talk about confusing regular CPU cores with dedicated SIMD engines.

AMD must be really dumb that it took them over a decade later to deliver a true 8-core CPU design with Zen, right? Even the FX series was a quad-core design with shared FPU.

As I said, most people are misinformed (and it's not entirely their fault)...

Interesting information that I've heard in the past as well.

Do you mind breaking down the differences that make a CPU a CPU and the Cell different, but yet functioning/serving as one, etc.?

Also, the Jaguar has 8 cores, but you mentioned Zen as being the first true 8-core?
 
Last edited:
Do you mind breaking down the differences that make a CPU a CPU and the Cell different, etc.?
A CPU excels at certain (serial) workloads. Cell SPEs are more analogous to GPU Compute Units. They're designed for parallel tasks like graphics, physics etc.

Cell as a CPU (PPE) is roughly 3 times slower than a modern x86 CPU at the same clock rate. You can see some benchmarks here: https://www.7-cpu.com/

Also this: http://crystaltips.typepad.com/wonderland/2005/03/burn_the_house_.html

"Gameplay code will get slower and harder to write on the next generation of consoles. Modern CPUs use out-of-order execution, which is there to make crappy code run fast. This was really good for the industry when it happened, although it annoyed many assembly language wizards in Sweden. Xenon and Cell are both in-order chips. What does this mean? It’s cheaper for them to do this. They can drop a lot of cores. One out-of-order core is about four times [did I catch that right? Alice] the size of an in-order core. What does this do to our code? It’s great for grinding on floating point, but for anything else it totally sucks. Rumours from people actually working on these chips – straight-line runs 1/3 to 1/10th the performance at the same clock speed. This sucks."

TL;DR: "Jaguar sux" meme is so 2013. Yes, it sucks compared to i5/i7 and even i3 when it comes to single-threaded performance (IPC/clocks), but it's a sizeable improvement compared to 7th gen PowerPC CPUs. As a CPU it was specifically made for multi-threading (most game engines took a while to adapt, Bethesda/Gamebryo is still shit). SIMD tasks are relegated to GPGPU/Compute shaders these days.

You will always be disappointed if you compare consoles to PCs... PCs will always be able to afford a higher thermal envelope. By the time consoles have 8 Zen cores @ 3.2 GHz, PCs will have up to 16 Zen cores @ 4-5 GHz.

Also, the Jaguar has 8 cores, but you mentioned Zen as being the first true 8-core?
Jaguar is a quad-core CPU for laptops (not tablets/smartphones, you won't find it there) that was specifically "customized" for consoles to have 8 cores (2 clusters of 4 cores) and a different memory controller (GDDR5). That's why they call it a "custom" CPU.

You could argue that Zen's design (2 CCX) resembles Jaguar's 2 clusters in a sense, but Zen is the first 8-core desktop CPU by AMD.
 
Last edited:

DeepEnigma

Gold Member
A CPU excels at certain (serial) workloads. Cell SPEs are more analogous to GPU Compute Units. They're designed for parallel tasks like graphics, physics etc.

Cell as a CPU (PPE) is roughly 3 times slower than a modern x86 CPU at the same clock rate. You can see some benchmarks here: https://www.7-cpu.com/

Also this: http://crystaltips.typepad.com/wonderland/2005/03/burn_the_house_.html

"Gameplay code will get slower and harder to write on the next generation of consoles. Modern CPUs use out-of-order execution, which is there to make crappy code run fast. This was really good for the industry when it happened, although it annoyed many assembly language wizards in Sweden. Xenon and Cell are both in-order chips. What does this mean? It’s cheaper for them to do this. They can drop a lot of cores. One out-of-order core is about four times [did I catch that right? Alice] the size of an in-order core. What does this do to our code? It’s great for grinding on floating point, but for anything else it totally sucks. Rumours from people actually working on these chips – straight-line runs 1/3 to 1/10th the performance at the same clock speed. This sucks."

TL;DR: "Jaguar sux" meme is so 2013. Yes, it sucks compared to i5/i7 and even i3 when it comes to single-threaded performance (IPC/clocks), but it's a sizeable improvement compared to 7th gen PowerPC CPUs. As a CPU it was specifically made for multi-threading (most game engines took a while to adapt, Bethesda/Gamebryo is still shit). SIMD tasks are relegated to GPGPU/Compute shaders these days.

You will always be disappointed if you compare consoles to PCs... PCs will always be able to afford a higher thermal envelope. By the time consoles have 8 Zen cores @ 3.2 GHz, PCs will have up to 16 Zen cores @ 4-5 GHz.


Jaguar is a quad-core CPU for laptops (not tablets/smartphones, you won't find it there) that was specifically "customized" for consoles to have 8 cores (2 clusters of 4 cores) and a different memory controller (GDDR5). That's why they call it a "custom" CPU.

You could argue that Zen's design (2 CCX) resembles Jaguar's 2 clusters in a sense, but Zen is the first 8-core desktop CPU by AMD.

That all makes sense, thank you.
 

bitbydeath

Member
I'm sorry buddy, but you're badly misinformed (and you're not the only one).

As I said, Cell isn't even a CPU to begin with: http://vr-zone.com/articles/sony-cell-processor-really-tough-get-grips/63732.html

"The Cell processor was the precursor to today’s accelerated processing units, very much a heterogeneous architecture"

Comparing Jaguar (which is a CPU) to Cell SPEs is disingenuous to say the least... wanna compare it to PPE (which is actually a CPU) and tell us how it performs? :)

This is a fair comparison: http://media.redgamingtech.com/rgt-...s-xbox-one-vs-xbox-360-vs-ps3-cpu-and-gpu.jpg


Oh, I remember people saying back in 2005 that it was the first "true" 8-core CPU. Talk about confusing regular CPU cores with dedicated SIMD engines.

AMD must be really dumb that it took them over a decade later to deliver a true 8-core CPU design with Zen, right? Even the FX series was a quad-core design with shared FPU.

As I said, most people are misinformed (and it's not entirely their fault)...

That fair comparison still has Cell ahead of the PS4 CPU.

And aren't the SPEs and PPE all part of Cell?
Everything I've seen references Cell as being a CPU with those core components.

Eg.

https://en.m.wikipedia.org/wiki/PlayStation_3
 
That fair comparison still has Cell ahead of the PS4 CPU.

And aren't the SPEs and PPE all part of Cell?
Everything I've seen references Cell as being a CPU with those core components.
I think I've made my case.

Aren't GPU CUs all part of an AMD APU?

If you're going to bring up SPEs (Cell's only saving grace, since PPE was crap even for 2005 CPU standards), then why shouldn't I bring up GCN CUs?

playstation-4-vs-xbox-one-vs-xbox-360-vs-ps3-cpu-and-gpu.jpg


GCN CUs were specifically added to assist the CPU in SIMD tasks. If you're not using them in 2019, you're doing it wrong.

PS4 is the polar opposite of the PS3 compute paradigm: Cell SPEs assist the RSX GPU in certain workloads (like vertex shaders, since RSX only had 8 pipelines vs 24 for pixel shaders), while GCN CUs assist the Jaguar CPU in SIMD tasks.

Think about it for a second. :)
 

Sophist

Member
Oh, I remember people saying back in 2005 that it was the first "true" 8-core CPU. Talk about confusing regular CPU cores with dedicated SIMD engines.

The Cell is truly an 8-core CPU; it's just that one core has a different architecture than the others, but they are all independent from each other. A SIMD unit is not independent; it's only a part of a core that executes some instructions from the program being executed by said core. A SPE is not part of the PPE: it has its own context and runs its own dedicated program independently. It's truly a core in its own right.
 
Last edited:
The Cell is truly an 8-core CPU; it's just that one core has a different architecture than the others, but they are all independent from each other. A SIMD unit is not independent; it's only a part of a core that executes some instructions from the program being executed by said core. A SPE is not part of the PPE: it has its own context and runs its own dedicated program independently. It's truly a core in its own right.
I'm afraid that's not what Sony fanboys meant back in 2005... if you get my drift. :)

Your definition of a CPU core is a bit "loose" to say the least. According to your logic, the Radeon GPU on a PS4 has 18 "cores" and thus it's more powerful than the Jaguar.

This is not a proper comparison either... CPUs and GPUs (or DSPs/SIMD engines if you will) are being tailored for entirely different workloads. Apples and oranges, really.

Cell was a weak CPU with powerful SIMD engines. It was a crap CPU and an excellent vector processor.

How are Jaguar AMD APUs any different?

Btw, if you want to desperately maximize your core count for PR purposes, you may as well add these ones as well (audio DSP, UVD, VCE, Zlib, ARM CPU in the southbridge):

https://en.wikipedia.org/wiki/PlayStation_4_technical_specifications#Hardware_modules
 
Wait, didn't Sony fans all cry about how backwards compatibility is a joke and nobody needs it...? Now they are making a big buzz for doing what everybody has been doing for years, and the last word has a boner, even though he himself said it was pointless...??

What the hell am I reading?

If it is about Jim Ryan's comment (Sony UK guy), I'd ignore what he says.

He tends to get roasted by a Reviewer on the Metro Newspaper for the silly things he says.

You are right though. It's weird to see the contradictions.
 
Last edited:

Heimdall_Xtreme

Jim Ryan Fanclub's #1 Member
If Cerny's invention is real... he can be considered the Leonardo da Vinci of gaming xD

I hope it's true, but how about if I put in a European PS2 or PS3 game?
 

LordOfChaos

Member
I'm afraid that's not what Sony fanboys meant back in 2005... if you get my drift. :)

Your definition of a CPU core is a bit "loose" to say the least. According to your logic, the Radeon GPU on a PS4 has 18 "cores" and thus it's more powerful than the Jaguar.

This is not a proper comparison either... CPUs and GPUs (or DSPs/SIMD engines if you will) are being tailored for entirely different workloads. Apples and oranges, really.

Cell was a weak CPU with powerful SIMD engines. It was a crap CPU and an excellent vector processor.

How are Jaguar AMD APUs any different?

Btw, if you want to desperately maximize your core count for PR purposes, you may as well add these ones as well (audio DSP, UVD, VCE, Zlib, ARM CPU in the southbridge):

https://en.wikipedia.org/wiki/PlayStation_4_technical_specifications#Hardware_modules


"What's a CPU core?"




AMD's Bulldozer architecture had two integer units and one FP unit per shared 'Module'. Was that one core or two? AMD said two, and traditionally they would be right as CPU core counts were based off INT units, but if you ran FP code it sure ran like one.

Did the SPUs have INT? Check, albeit prioritized less than FP
cell-broadband-engine-project-cell-spes-2001-ibm-103-cell-spe-MYBTK2.jpg




Another defining factor is sometimes whether a core has independent caches - the SPUs were weird here again, with only a local store that was famously painstaking to micromanage, until better toolkits came along.

Access to main memory? "Sorta"! Requests from an SPU had to be passed from the SPU to the SPE memory flow controller to set up a DMA operation within the system address space.


Was it a core or wasn't it? I find the debate a bit irrelevant, but in my mind I see them as cores that cut out much of what makes modern processors fast (prefetching, caches) in favor of brute SIMD - yet they also weren't just SIMD units in a vacuum.
 

azz0r

Banned
To those saying that BC doesn't matter: that USED to be true. It has drastically changed now that so many games are purchased digitally.

While, I'm not ready to call it a deal breaker, I'd definitely say it matters far more going from PS4 to PS5 than it did for PS3 to PS4.

What I expect will happen is that games will become more like apps on phones: developers will release one game that can scale across console generations, and after enough time they can stop supporting older models.
That would make me so happy. I think the problem might be that developers then lose out on the re-buy of the game.

But I hope I'm wrong.
 

Panajev2001a

GAF's Pleasant Genius
I just hate when the work of dozens (hundreds?) of people is attributed to one individual. I just hate it.

You are correct; it would be wrong to assume that and to hide the effort of everyone involved. On the other side, it is also quite easy to assume that Leads / Architects / <insert people at the top> are replaceable and useless, not pulling their weight the way the labour force below them does (but then we get into the more Marxist view of labour, which is way OT).
 
Last edited:
"What's a CPU core?"

AMD's Bulldozer architecture had two integer units and one FP unit per shared 'Module'. Was that one core or two? AMD said two, and traditionally they would be right as CPU core counts were based off INT units, but if you ran FP code it sure ran like one.
Well, a jury will have to decide that: https://www.theregister.co.uk/2019/01/22/judge_green_lights_amd_core_lawsuit/

:)

That would make me so happy. I think the problem might be that developers then lose out on the re-buy of the game.

But I hope I'm wrong.
Judging by 360 BC games, this isn't a big issue. Most 360 games being sold these days are digital, since it's a bit difficult to find physical copies of old games (new or used).

Why wouldn't Sony want to sell PS1/PS2/PS3 games on the PS store if they can? Having a unified ecosystem will be a boon to their profits.
 

Sophist

Member
I'm afraid that's not what Sony fanboys meant back in 2005... if you get my drift. :)

Your definition of a CPU core is a bit "loose" to say the least. According to your logic, the Radeon GPU on a PS4 has 18 "cores" and thus it's more powerful than the Jaguar.

This is not a proper comparison either... CPUs and GPUs (or DSPs/SIMD engines if you will) are being tailored for entirely different workloads. Apples and oranges, really.

Cell was a weak CPU with powerful SIMD engines. It was a crap CPU and an excellent vector processor.

How are Jaguar AMD APUs any different?

Btw, if you want to desperately maximize your core count for PR purposes, you may as well add these ones as well (audio DSP, UVD, VCE, Zlib, ARM CPU in the southbridge):

https://en.wikipedia.org/wiki/PlayStation_4_technical_specifications#Hardware_modules

It's not loose at all. A SPE, while designed for batch processing, is still able to run all kinds of programs and has access to all devices. The SPEs are seen as logical cores by the operating system. For example, you could run a web server on one SPE and, at the same time, run a game server on another SPE. This is not the case with a GPU, where the inner cores are invisible and not manageable.
 
It's not loose at all. A SPE, while designed for batch processing, is still able to run all kinds of programs and has access to all devices. The SPEs are seen as logical cores by the operating system. For example, you could run a web server on one SPE and, at the same time, run a game server on another SPE. This is not the case with a GPU, where the inner cores are invisible and not manageable.
Can it run general purpose code at acceptable performance? PPE sucked at it, so...

You can also use your feet to draw a painting, but I wouldn't recommend it.

There's always a good and a bad tool for every job out there...
 

LordOfChaos

Member
Can it run general purpose code at acceptable performance? PPE sucked at it, so...

You can also use your feet to draw a painting, but I wouldn't recommend it.

There's always a good and a bad tool for every job out there...


At acceptable performance if you provided hinting and carefully managed the memory flow - but otherwise it could still run standard code. Again, the answer is always a fuzzy "if" with Cell, which is why it remains so interesting to this day.

Digital Foundry: In our last interview you compared Xbox 360's CPU to Nehalem (first-gen Core architecture from Intel) in terms of performance. So how does PlayStation 3 stack up? And from a PC perspective, has CPU performance really moved on so much since Nehalem?
Oles Shishkovstov: It's difficult to compare such different architectures. SPUs are crazy fast running even ordinary C++ code, but they stall heavily on DMAs if you don't try hard to hide that latency.

https://www.eurogamer.net/articles/digitalfoundry-inside-metro-last-light
 
Last edited:
At acceptable performance if you provided hinting and carefully managed the memory flow - but otherwise it could still run standard code. Again, the answer is always a fuzzy "if" with Cell, which is why it remains so interesting to this day.

https://www.eurogamer.net/articles/digitalfoundry-inside-metro-last-light
Trying hard to hide latency puts it more in the same category as GPUs IMHO. You don't have to try hard in a regular CPU design.

PPE had issues running branchy code, that even a Pentium 3 could run much better.
 

decisions

Member
Despite all of Sony’s wierdness lately, I’ll still believe in them as long as Mark Cerny is in a high position.
 

LordOfChaos

Member
Trying hard to hide latency puts it more in the same category as GPUs IMHO. You don't have to try hard in a regular CPU design.

PPE had issues running branchy code, that even a Pentium 3 could run much better.

Yet a GPU won't just run your standard CPU oriented C++ code well as long as you hide latency. They're addressed in completely different ways. An SPU would run unoptimized code poorly, a GPU wouldn't run it at all.

A CPU's
Code:
for(int i = 0; i < N; i++)
    c[i] = a[i] + b[i];

Becomes a GPUs
Code:
__global__ void add(int *c, int *a, int *b, int N)
{
    // Each thread starts at its global index, then strides over the array
    int pos = threadIdx.x + blockIdx.x * blockDim.x;
    for (; pos < N; pos += blockDim.x * gridDim.x)
        c[pos] = a[pos] + b[pos];
}


An SPU could run block 1, even if poorly, without optimizing it further. The debate isn't whether it was the best choice, but it's certainly still more of a GPU-ified CPU than a GPU, imo.
 
Last edited:
Yet a GPU won't just run your standard CPU oriented C++ code well as long as you hide latency. They're addressed in completely different ways. An SPU would run unoptimized code poorly, a GPU wouldn't run it at all.

A CPU's
Code:
for(int i = 0; i < N; i++)
    c[i] = a[i] + b[i];

Becomes a GPUs
Code:
__global__ void add(int *c, int *a, int *b, int N)
{
    // Each thread starts at its global index, then strides over the array
    int pos = threadIdx.x + blockIdx.x * blockDim.x;
    for (; pos < N; pos += blockDim.x * gridDim.x)
        c[pos] = a[pos] + b[pos];
}

An SPU could run block 1, even if poorly, without optimizing it further. The debate isn't whether it was the best choice, but it's certainly still more of a GPU-ified CPU than a GPU, imo.
What about modern (GP)GPUs?

Sophist said that "the inner cores are invisible and not manageable". I think that's only valid for the old GPU paradigm where you just used it for graphics and nothing else.

https://developer.nvidia.com/how-to-cuda-c-cpp

Is that the case with technologies like CUDA?
 

FranXico

Member
What about modern (GP)GPUs?

Sophist said that "the inner cores are invisible and not manageable". I think that's only valid for the old GPU paradigm where you just used it for graphics and nothing else.

https://developer.nvidia.com/how-to-cuda-c-cpp

Is that the case with technologies like CUDA?
You can use the GPU for a lot of things, and it will process the data very fast... but there's a cost to loading the data into VRAM on split architectures.
I suppose this is far less of an issue on consoles with a shared memory pool.
 
I first heard about this over on VGR. I’m guessing that we’ll at least get to play PS4 games on PS5, but I am honestly really doubtful about whether we’ll be able to play anything older than that. I really hope so :/
 

bitbydeath

Member
Can it run general purpose code at acceptable performance? PPE sucked at it, so...

You can also use your feet to draw a painting, but I wouldn't recommend it.

There's always a good and a bad tool for every job out there...

PPE was the instructor for the SPEs though; they were not separate in function.

Think of an orchestra requiring a conductor to play music: without the conductor, everything falls apart.
 
PPE was the instructor for the SPEs though; they were not separate in function.

Think of an orchestra requiring a conductor to play music: without the conductor, everything falls apart.
IKR? Jaguar and GCN CUs have a similar relationship... they're meant to assist each other, just in the opposite way compared to the PS3.
 

Meh3D

Member
This patent thing has snowballed thanks to some of the dumbest gaming news reporting ever. At least there are some interesting conversations around it.

What about modern (GP)GPUs?

Sophist said that "the inner cores are invisible and not manageable". I think that's only valid for the old GPU paradigm where you just used it for graphics and nothing else.

https://developer.nvidia.com/how-to-cuda-c-cpp

Is that the case with technologies like CUDA?

They still are for GPGPU with Apple, AMD, NVIDIA, Intel, ARM, Samsung (eventually) etc.... You don't want to be messing with that anyway. From a lower level, we don't want (trust) any user down there. Many of these inner workings are hard-wired in and react/instruct/sync/launch etc. many times faster than any user can manage. (Except maybe Carmack.) This was also a welcome trade-off when going from pixel + vertex shaders to unified shaders; the user is removed from the process. >8]

Edit: When I say "user" I mean programmers. 8P
 
Last edited:
This patent thing has snowballed thanks to some of the dumbest gaming news reporting ever. At least there is some interesting conversations around this.



They still are for GPGPU with Apple, AMD, NVIDIA, Intel, ARM, Samsung (eventually) etc.... You don't want to be messing with that anyway. From a lower level, we don't want (trust) any user down there. Many of these inner workings are hard-wired in and react/instruct/sync/launch etc. many times faster than any user can manage. (Except maybe Carmack.) This was also a welcome trade-off when going from pixel + vertex shaders to unified shaders; the user is removed from the process. >8]

Edit: When I say "user" I mean programmers. 8P
I'm not sure if you're talking about consoles specifically or other platforms like PC/mobile. Cell and semi-custom AMD APUs are best utilized on a closed platform. Open platforms tend to favor more generalized programming.

GCN allows you to go very deep if you want:

https://gpuopen.com/amdgcn-assembly/
https://gpuopen.com/amd-gcn-assembly-cross-lane-operations/

It can be argued whether this is a good idea or not. Obviously it won't be if AMD abandons GCN in favor of something entirely new.
 
Last edited:
But people buy new systems to play new games, right? Nobody cares about backwards compatibility, I was told.

I don't know about you, but when I buy a new system, I really do expect for my current library to carry over.

I shouldn't have to rely on emulation to play games once the official hardware gives up the ghost.

I really only see people like Blackb0nd unironically saying stuff like that.
 

Sophist

Member
Can it run general purpose code at acceptable performance? PPE sucked at it, so...

If the program is small enough, it seems to be as fast as the PPE. The main problem is memory: a SPE has no cache, only a 256 KB local store that has to hold both the data and the program instructions. With a huge program, you would have to do a lot of DMA operations for the instructions alone (Super Meat Boy has more than 1,000 KB of instructions, for example).


(https://www.csm.ornl.gov/SC2007/pres/Meredtih_Cell_Judy/Meredith_Cell_SC07.pdf)

What about modern (GP)GPUs?

They still don't expose their inner cores as logical cores to the system; they are still seen as a black box that you have to operate through a driver interface (OpenCL, Vulkan, ...). You cannot, for example, tell core #0 to run this program and core #1 to run that program. To give a concrete example, if you could run Microsoft Windows on a Cell, you would see the SPE cores in the task manager.
 

Panajev2001a

GAF's Pleasant Genius
256 kb that has to hold both the data and the program instructions. With a huge program, you would have to do a lot of DMA operations for the instructions alone

True, but then again you would do prefetching manually (locality will likely hold true for SMB too :)) and try to split your work into 256 KB of instructions plus data, spread across SPUs. Counting the very large register file too, you have similar memory to many general-purpose cores of the time, but you have to manage it manually... same with coherency across cores (messages could be sent from SPU to SPU with a dedicated interface). From what I recall hearing way back when, some programs could actually run faster on an SPU due to the many idiosyncratic behaviours of the PPU (at least it was 3.2 GHz and supported multiple HW threads... let's give it at least that).

Sure, all of that sounds painful, yet it can deliver some amazing and deterministic results, and the way you make it sing is the same way you think about scaling on most modern architectures: start from the data you work with, how best to split it into chunks you can work on in parallel (the instructions that come out of this are data too), and how to move it across the chip. If you have caches and HW prefetching, a lot of the moving and synchronising is done for you, but you still have to look at all the other constraints yourself (cache line size, size of the working set vs the L1 and L2+ caches, latency for cache misses and probability of misses, whether to lock part of the cache as scratchpad or not, bandwidth to main memory and how to split the data and organise loads and writes to maximise it, etc...).
 

DiscoJer

Member
Isn't this really not a question of technical ability, but simply of whether Sony wants to?

For instance, the Vita can play the whole PSP library. But you literally cannot play most of it because Sony has locked down most PSP games from being downloaded to the Vita via PSN. Only by hacking the Vita can you play them.
 
Last edited:

c0de

Member
To give a concrete example, if you could run Microsoft Windows on a Cell, you would see the SPE cores in the task manager.
Is that true, though? You could run Linux on the PS3 but did a "top" show the SPUs? I somehow doubt that.
Edit:
This is interesting:

On boot, you see two big Tuxes and 7 small ones but a "cat /proc/cpuinfo" only shows the PPC cores.
 
Last edited:

Panajev2001a

GAF's Pleasant Genius
Is that true, though? You could run Linux on the PS3 but did a "top" show the SPUs? I somehow doubt that.
Edit:
This is interesting:

On boot, you see two big Tuxes and 7 small ones but a "cat /proc/cpuinfo" only shows the PPC cores.


The small penguins are likely due to a kernel patch to see the extra cores: Linux is generally built for homogeneous cores.
 

Sophist

Member
Is that true, though? You could run Linux on the PS3 but did a "top" show the SPUs? I somehow doubt that.
Edit:
This is interesting:

On boot, you see two big Tuxes and 7 small ones but a "cat /proc/cpuinfo" only shows the PPC cores.


The SPEs are not exposed in /proc/ but in their own virtual filesystem: spufs. Linux supports execution of SPE standalone programs.
 

Breakage

Member
I think at the very least the PS5 needs to have PS4 BC. The idea that all the software that PS users have purchased this gen will be incompatible with the next PlayStation just seems absurd in this day and age.
 