
KeplerL2: PS6 Likely to Use 1TB SSD with No Disc Drive; Game Size Could Be Reduced with Neural Texture Compression

Compression is not new. And GPUs already do it.

Let's shut this Kool-Aid narrative down now. Kraken takes 5.5 GB/s in and puts out up to 9 GB/s. In other words, that's how much it can decompress: 5.5 GB every second, with data that's compressed to 61% of its original size at most.

You're asking for Kraken to now do the same thing, except now it has to decompress 640 GB/s (roughly 120x)...

The CPU is roughly a 9600X with worse single-core performance. But you basically sacrifice an entire core to do 6-7 GB/s of very fast decompression. (Way worse than Kraken's compression, by the way.)

And even if it were possible to build such an ASIC, you can't add latency to memory without degrading GPU performance. Not on a GPU with as little cache as the PS6.

This is just hocus-pocus stuff. Sorry.
I don't know what you are talking about... This is not Kraken 2.0... Kraken focused on data compression, particularly textures, on the drive, then decompressed it using the I/O block just before it gets to RAM, but in RAM, the data is in its uncompressed state.

UC deals with all data, not just textures, and keeps it compressed even when it's in RAM and only decompresses it just in time. Two very different technologies. And I don't know what you are talking about with the CPU core; UC is said to have a UC block, the same way we had an I/O block for Kraken, and it's also assisted by the neural arrays.

It's not rocket science, nor is it conjecture. Cerny literally made it one of three talking points during the Amethyst thing... unless of course we are just gonna say Cerny is lying and doesn't know what he's talking about but you do.
Teraflops do matter, RAM does matter. Computationally it's not a generational leap above the Pro; I still stand by that, and so do a lot of people pushing back at the idea of it coming out next year. You can't just AI this shit away by saying the fundamentals don't matter. Especially when we haven't seen RDNA5 in full effect and this hardware will be the first gen of it.
No one said they don't, but those are not the things that will define hardware going forward. We are at a point where we have enough TF for where TF matters, and we have enough RAM. Especially considering that to push those things much further, the costs required would not translate to the gains. Hence why hardware now has to be built smarter. Everything Nvidia has been doing since 2019... has been about building smarter, not just building more.

And AI this shit away? Have you even just taken a cursory glance to see what's been going on in the graphics space over the last 4 years? There is no way, no how you brute force your way past what's coming.
 
Again, hardware-wise it's not much removed from the PS5 Pro, and we don't know what RDNA5 is going to look like other than that it's going to be relatively new. Just like Nvidia, whose first-gen versions are lacking compared to what comes in the 2nd or 3rd iteration. Sony is going to drop it in on the first pass of RDNA5 and hope for the best? There are a lot of good reasons for waiting, but hey, I'm not Sony, I am just expressing my opinion, that's it.
 
With Oodle textures it's 5.5GB/s -> ~10GB/s (average, not up to). It's actually up to 22GB/s for some kinds of data. I remember a Spanish dev talking about it years ago in this forum.

And it should be possible to combine Kraken + Oodle + NTC.
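
For anyone who wants to check the arithmetic behind these throughput claims, here is a minimal sketch in Python. It only uses numbers quoted in this thread (5.5 GB/s raw, 9 GB/s, ~10 GB/s and 22 GB/s effective, 640 GB/s rumoured memory bandwidth); the function names are made up for illustration and nothing here is a measurement.

```python
# Back-of-the-envelope check of the throughput figures quoted in this thread.
# All inputs are posters' numbers, not measurements.

RAW_SSD_GBS = 5.5       # raw SSD read rate quoted for PS5
MEMORY_BW_GBS = 640.0   # rumoured PS6 memory bandwidth quoted above


def compressed_fraction(raw_gbs: float, effective_gbs: float) -> float:
    """Fraction of the original size the data must shrink to so that
    raw_gbs of compressed input expands to effective_gbs of output."""
    return raw_gbs / effective_gbs


def speedup(raw_gbs: float, effective_gbs: float) -> float:
    """Compression ratio implied by an effective throughput figure."""
    return effective_gbs / raw_gbs


for label, effective in [("Kraken", 9.0), ("with Oodle textures", 10.0), ("best case", 22.0)]:
    ratio = speedup(RAW_SSD_GBS, effective)
    fraction = compressed_fraction(RAW_SSD_GBS, effective)
    print(f"{label}: {ratio:.2f}x, data at {fraction:.0%} of original size")

print(f"memory bandwidth vs raw SSD rate: {MEMORY_BW_GBS / RAW_SSD_GBS:.0f}x")
```

So the "61%" and "120x" figures are both roughly consistent with the quoted rates; 640 / 5.5 comes out a little under 120.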
 
I'm interested in the system, but I'll be wanting a disc drive. I suppose an attachment will be available if no SKU is offered.

As far as power, it looks to be pretty good actually. I'm more interested to see what the PSSR capabilities are with newer specifications.

PSSR2 was pretty impressive and they said there won't be any more iterations *this year*, so I'm excited to see what they can do.
 
I believe UC is just a next-gen version of what they did with Kraken on the PS5. It basically allows the PS6 to make 30 GB of RAM appear and function as 45 GB of RAM. Numbers from my ass, but in theory that's how it's supposed to work. And yes, I agree that both can work together.
UC has absolutely nothing to do with Kraken. UC is an evolution of DCC and doesn't have any effect on game file sizes or VRAM utilization.
 
Good to know. I was just about to ask for evidence that it actually reduces VRAM utilization, i.e. "compressed in RAM".

Any thoughts on what ratios we can expect from it though? Going from DCC to "any data type" is quite the jump and would require a complete rethink. And to do all that without introducing latency... the closest thing I can think of is Blackwell DE, but it seems this is breaking new ground with the data types supported, like geometry, BVH etc.?
 
I clarified this in a later post saying UC is not Kraken.

But I was under the impression that it compresses all data. Which naturally would mean that, before decompression, you can fit more of that data in RAM.
 
The more I read, the less interesting this is.
I can't remember feeling so underwhelmed by a console predecessor made by a brand I rock with.
 
Definitely lower ratios than DCC or they (or NVIDIA) would have done it already. But it makes sense to try to squeeze more performance/efficiency out of existing VRAM/Cache bandwidth since modern nodes have great logic scaling but very poor SRAM/analog scaling.
 
Are you certain Rubin does not have something similar?

DCC is 2x or more right? That would be a tall order to match anyway. But why would Nvidia not go down this path even at 1.4 or 1.5X? Is it because they aren't as bandwidth starved as AMD would be with consoles? Still seems like a worthwhile exercise. It would be strange, albeit refreshing, for AMD to do something as a first to market.

But I was of the impression that it compresses all data. Which naturally would mean that pre decompression u can fit more of that data in RAM

Yeah that part wasn't going to be the case. Keeping it compressed in RAM is a whole different can of worms that isn't the GPU's job. This entire approach appears to be "in transit" i.e. gains only in effective bandwidth.
 
Are you certain Rubin does not have something similar?
Maybe? At least in the information released for the datacenter version of Rubin, they didn't mention anything similar to UC.
DCC is 2x or more right?
Can be as high as 8x but usually closer to 2x IIRC
But why would Nvidia not go down this path even at 1.4 or 1.5X? Is it because they aren't as bandwidth starved as AMD would be with consoles? Still seems like a worthwhile exercise. It would be strange, albeit refreshing, for AMD to do something as a first to market.
They probably will but yeah AMD might have gotten there first.

Also I found the RDNA4 compression slide:

https://substack-post-media.s3.amazonaws.com/public/images/fd6b4667-39e3-453b-adf2-31e229f9b89e_1600x897.png


UC benefit on perf/efficiency will be higher than this since it affects more buffers/data within the GPU.
 
Nice! I'll gladly take 1.4-1.5x. Free lunch, indeed!
 
And It should be possible to combine Kraken + Oodle + NTC.
Not really. Oodle Kraken is entirely built on traditional Block Compressed textures. NTC will represent a shift from BC as the decompression will happen within the GPU. If devs go the NTBC route, it would still make Sony's custom decompression blocks redundant as the transcoding to BC happens after the textures have already reached the GPU from the SSD.

I expect Oodle Kraken to be used for backwards compatibility and as an interim solution till the whole industry moves to NTC. If NTC takes over (I think it eventually will, but not at launch), traditional BC will become obsolete. Their custom I/O complex will still be leveraged to get the benefits of the SSD, but not the decompression blocks.
 
AMD's research has been on NTBC so far: https://gpuopen.com/download/2024_NeuralTextureBCCompression.pdf
 
I am not sure I understand you here. What I read back then is that the uncompressed textures are loaded directly into VRAM, ready to use by the GPU. Can't they be used by the GPU for the NTC inference?
 
UC has absolutely nothing to do with Kraken. UC is an evolution of DCC and doesn't have any effect on game file sizes or VRAM utilization.
It's to deal with RDNA5 having no bandwidth. Think about it.

PS6 : 52 CU, 10MB L2, 640 GB/s
9070: 56 CU, 08MB L2, 925 GB/s (effective with IC)

RDNA5 has basically ditched as much SRAM as humanly possible and replaced it with logic. And the PS6 has both a halved L2 and 32Gbps memory (while the rest of RDNA5 uses 36Gbps).

Much more aggressive memory compression (and RDNA4's memory compression was broken in many titles), and some other non-public tricks for dealing with convergent shaders (probably to make up for L0/L1 cuts?).
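
To put the two spec lines side by side, here is a rough sketch in Python. The CU counts and bandwidth figures are the ones posted above (the PS6 numbers are rumours, and the 9070 number is the poster's "effective with IC" figure). The bus-width step assumes the 640 GB/s is raw bandwidth at 32Gbps per pin, and the 1.4-1.5x compression ratios are just the range floated earlier in this thread, not a confirmed figure. All names are hypothetical.

```python
# Rough comparison built only from figures quoted in this thread.

SPECS = {
    "PS6": {"cus": 52, "bandwidth_gbs": 640.0},    # rumoured
    "9070": {"cus": 56, "bandwidth_gbs": 925.0},   # poster's effective figure
}


def bandwidth_per_cu(name: str) -> float:
    """GB/s of memory bandwidth available per compute unit."""
    spec = SPECS[name]
    return spec["bandwidth_gbs"] / spec["cus"]


def bus_width_bits(total_gbs: float, pin_rate_gbps: float) -> float:
    """Bus width implied by total bandwidth (GB/s) and per-pin rate (Gbit/s)."""
    return total_gbs * 8 / pin_rate_gbps


def raw_bandwidth_gbs(bus_bits: float, pin_rate_gbps: float) -> float:
    """Total bandwidth (GB/s) for a given bus width and per-pin rate."""
    return bus_bits * pin_rate_gbps / 8


for name in SPECS:
    print(f"{name}: {bandwidth_per_cu(name):.1f} GB/s per CU")

bus = bus_width_bits(640.0, 32)
print(f"implied bus width at 32Gbps: {bus:.0f}-bit")
print(f"same bus at 36Gbps: {raw_bandwidth_gbs(bus, 36):.0f} GB/s")

# Effective bandwidth if in-transit compression averaged 1.4x or 1.5x (assumption).
for ratio in (1.4, 1.5):
    print(f"{ratio}x compression: ~{640.0 * ratio:.0f} GB/s effective")
```

On these numbers the PS6 would have about three quarters of the per-CU bandwidth of the 9070, which is the gap the compression is supposed to help close.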
 
Do not tell me we are back to the early PS2 days with sloped surfaces and blurriness 🤣.

Oh well, we will see. We need to do something to combat storage and memory space issues, just wish we did not have to hehe.

Pana

It's Nvidia

You realize just how many papers and research they generate in the field they basically own? It's either wilful ignorance or borderline cute to think they didn't think of all this.

 
I am not sure I understand you here. What I read back then is that the uncompressed textures are loaded directly into VRAM, ready to use by the GPU. Can't they be used by the GPU for the NTC inference?
Let me explain:

Here's how PS5 Kraken works:

1) SSD has Oodle compressed BC textures
2) Texture fetched from SSD via Sony's custom I/O complex
3) Texture decompressed by custom decompression block to standard BCn encode and sent directly to VRAM. This can be consumed by the GPU directly


Here's how NTC on sample/demand would work on PS6 (based on how it currently works on Nvidia or Intel):
1) SSD has MLP and latent texture stored in a custom format. Say " .ntc"
2) .ntc fetched via Sony's custom I/O complex
3) .ntc is sent directly to VRAM
4) During render time, GPU picks up .ntc from VRAM, performs inference within tensor core/neural array and renders texture directly using stochastic filtering or Ubisoft-style pre-filtering

There is no standard BCn texture in the above NTC method. So Kraken/Oodle has no role to play between step 2 and 3. It isn't built for that format.

Now let's take the NTC Inference on load method on PS6 (based on how it currently works on Nvidia or Intel):

1) SSD has MLP and latent texture stored in a custom format. Say " .ntc"
2) .ntc fetched via Sony's custom I/O complex
3) (EDIT) .ntc is sent directly to VRAM. During game load, perform inference within tensor core/neural array and transcode to standard BCn
4) Store BCn back in VRAM. GPU uses this directly.

AMD's NTBC method is similar to the above inference on load, with the only difference being the format of storage is itself Block Compressed. So let's call it ".ntbc". Not to be confused with standard BCn textures

Because the BCn exists only after step 3 for inference on load, I believe there is no role for Kraken as the I/O complex has already done its job of transferring .ntc (or .ntbc) file as-is.

Kraken does not understand .ntc or .ntbc. It only understands BCn.

There could be the possibility of compressing AMD's NTBC further using an Oodle-like technique, but that would be a pointless exercise imo. It would be like compressing .zip again with .gz. You aren't going to get much out of compressing something that is already compressed to the max.

Sorry for the essay. Hope that makes sense.
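
The three flows described above can be sketched as plain data, which makes the "where does BCn first show up" argument easy to see. This is only an illustrative model in Python of the steps as listed in this post; the stage labels, format strings and function names are hypothetical and say nothing about how the real hardware or file formats are implemented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    where: str  # hypothetical location label: "ssd", "io", "vram" or "gpu"
    fmt: str    # format of the texture data as it leaves this stage


# PS5 Kraken path: the I/O block's decompressor outputs standard BCn.
PS5_KRAKEN = [
    Stage("ssd", "oodle+bcn"),
    Stage("io", "bcn"),
    Stage("vram", "bcn"),
]

# NTC on sample: the file passes through untouched, inference at render time.
NTC_ON_SAMPLE = [
    Stage("ssd", "ntc"),
    Stage("io", "ntc"),
    Stage("vram", "ntc"),
    Stage("gpu", "texels"),
]

# NTC inference on load: the GPU transcodes to BCn and stores it back in VRAM.
NTC_ON_LOAD = [
    Stage("ssd", "ntc"),
    Stage("io", "ntc"),
    Stage("vram", "ntc"),
    Stage("gpu", "bcn"),
    Stage("vram", "bcn"),
]


def first_bcn_stage(pipeline: List[Stage]) -> Optional[str]:
    """Return where plain BCn data first appears, or None if it never does."""
    for stage in pipeline:
        if stage.fmt == "bcn":
            return stage.where
    return None


def io_block_decompresses(pipeline: List[Stage]) -> bool:
    """True only if BCn comes out of the I/O stage, i.e. the decompression block did the work."""
    return first_bcn_stage(pipeline) == "io"


for name, pipeline in [("PS5 Kraken", PS5_KRAKEN),
                       ("NTC on sample", NTC_ON_SAMPLE),
                       ("NTC on load", NTC_ON_LOAD)]:
    print(f"{name}: BCn first appears at {first_bcn_stage(pipeline)}, "
          f"I/O block decompresses: {io_block_decompresses(pipeline)}")
```

In this model the decompression block only has something to do in the first flow; in both NTC flows the I/O stage just moves the file.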
 
Not having any game discs out there fueling a shady grey market could actually help them get some revenue going. PS5 used game sales from GameStop, eBay etc. would go straight to their bank account. I think we've reached the threshold where they gain more than they lose by going all digital. A few people will drop out, but simply not dealing with used game sales makes up for that fairly quickly, I think. The PS6 handheld already wouldn't be able to use discs unless they make it a true standalone disc drive. I wish Sony was still that cool. Make it all nice like the FiiO joints.
 
Thanks for this, I thought I was losing my mind for a bit there, or that I was somehow confusing everything I thought I knew.

However...
Let me explain:

Here's how PS5 Kraken works:

1) SSD has Oodle compressed BC textures
2) Texture fetched from SSD via Sony's custom I/O complex
3) Texture decompressed by custom decompression block to standard BCn encode and sent directly to VRAM. This can be consumed by the GPU directly
So, where does UC fall into this? Because as I understand it, UC is not Kraken, so it has nothing to do with the I/O complex... As I understood it, UC has its own dedicated UC blocks supported by the neural arrays, and UC works on everything, not just textures.
Here's how NTC on sample/demand would work on PS6 (based on how it currently works on Nvidia or Intel):
1) SSD has MLP and latent texture stored in a custom format. Say " .ntc"
2) .ntc fetched via Sony's custom I/O complex
3) .ntc is sent directly to VRAM
4) During render time, GPU picks up .ntc from VRAM, performs inference within tensor core/neural array and renders texture directly using stochastic filtering or Ubisoft-style pre-filtering

There is no standard BCn texture in the above NTC method. So Kraken/Oodle has no role to play between step 2 and 3. It isn't built for that format.

Now let's take the NTC Inference on load method on PS6 (based on how it currently works on Nvidia or Intel):

1) SSD has MLP and latent texture stored in a custom format. Say " .ntc"
2) .ntc fetched via Sony's custom I/O complex
3) Before storing in VRAM, perform inference within tensor core/neural array and transcode to standard BCn
4) Store BCn in VRAM. GPU uses this directly.
At least I get all this.
AMD's NTBC method is similar to the above inference on load, with the only difference being the format of storage is itself Block Compressed. So let's call it ".ntbc". Not to be confused with standard BCn textures

Because the BCn exists only after step 3 for inference on load, I believe there is no role for Kraken as the I/O complex has already done its job of transferring .ntc (or .ntbc) file as-is.

Kraken does not understand .ntc or .ntbc. It only understands BCn.

There could be the possibility of compressing AMD's NTBC further using an Oodle-like technique, but that would be a pointless exercise imo. It would be like compressing .zip again with .gz. You aren't going to get much out of compressing something that is already compressed to the max.
Again, where does Universal Compression fit into all this? Because if it's its own thing, then can't it also compress an .ntc file? Because when you think about it, a .ntc file isn't actually a compressed file, it's just a file saved in a different format, one that only a neural network can read. And even when looking at inference on load, it's not necessarily decompressing the file, it's kinda rebuilding it.
Wonder if the PS5 Pro disc drive will be compatible.
Lol.
 
UC compresses all the buffers and data *within* the GPU. It's not a file compression format.
 
Why would most people say "finally" to options being taken away?
Fanboy delusions and overactive imaginations.

Worth remembering: at the PS5 launch there were the digital and disc drive models. Those digital units were forever excluded from discs.

They then released v2 models with an optional disc drive; from that point you could upgrade the v2 DE version with the disc drive attachment.

And both DE and disc-drive-included models were sold alongside each other.

Now, the PS5 Pro didn't include a disc-drive-based model at retail, but I'd argue that with the premium-priced unit Sony figured they should keep price shock to a minimum, as $750 was a fair whack, and those that wanted one could get the $79 attachment.

In delusion land, Sony is abandoning discs.

In reality, they have actually embraced discs more with the release of the v2 DE, which was upgradeable to discs whereas the launch DE wasn't.

And when we look at a market where we know the DE versus disc drive split, Japan, a massive percentage have chosen the disc drive model. Data courtesy of the latest Famitsu sales thread; napkin math says about 75% are disc drive models?

 
Oooooooohhhh... I think I get it now.
The .ntc is already quite tiny and in a binary format. The latent texture is binary and the MLP is binary as well. Together they make up the .ntc.

I'm sure the inner workings of all this are way more complex, but this is not my line of work, so I'm just trying to understand and describe it from a layman's perspective.

UC will kick in for everything moving between RAM and the WGPs/CUs. That's when VRAM bandwidth is used. So UC will affect all aspects of rendering: geometry, animation, ray tracing, BVH, neural rendering, PSSR etc. It will try to compress textures too, after they are loaded to RAM.

So will that compress the .ntc when the WGP pulls it from RAM for inference? Maybe? No idea. Like I said, you don't get much out of compressing something that is already compressed.

But there should still be significant gains in other areas. BVH and triangle geometry, for example, are all uncompressed in memory by default. Compressed BVH is already an area of research. And geometry formats of the future like DGF essentially convert geometry into a BC-like format. Maybe DGF will get further compressed in transit? Don't know. Until there is more literature, it's all speculation.
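
The "you don't get much out of compressing something that is already compressed" point is easy to demonstrate with a general-purpose compressor. A minimal sketch in Python, using `zlib` from the standard library purely as a stand-in: it is not Kraken, UC or NTC, and the synthetic word data is made up for the demo.

```python
import random
import zlib

# Deterministic, text-like test data: random picks from a small vocabulary.
rng = random.Random(0)
vocab = [f"word{i}" for i in range(64)]
raw = " ".join(rng.choice(vocab) for _ in range(20000)).encode()

once = zlib.compress(raw, 9)    # first pass: plenty of redundancy to remove
twice = zlib.compress(once, 9)  # second pass: input is already near-random

print(f"raw:          {len(raw)} bytes")
print(f"compressed:   {len(once)} bytes ({len(once) / len(raw):.0%} of raw)")
print(f"recompressed: {len(twice)} bytes ({len(twice) / len(once):.0%} of first pass)")

# Both layers still round-trip losslessly.
assert zlib.decompress(zlib.decompress(twice)) == raw
```

The first pass shrinks the data a lot; the second pass has almost nothing left to work with. Whether a hardware scheme like UC behaves the same way on an .ntc file is exactly the open question above.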
 
In delusion land, Sony is abandoning discs.

In reality, they have actually embraced discs more with the release of v2 DE which was upgradeable to discs whereas launch DE wasn't.

And when we look at a market where we know the DE versus disc drive split, Japan, a massive percentage have chosen the disc drive model. Data courtesy of the latest Famitsu sales thread; napkin math says about 75% are disc drive models?
Sorry to break it to you, but the only thing delusional is thinking that discs won't be abandoned at some point.

First, to address what you said about people choosing the disc version: at launch, Sony was making and selling more disc SKUs simply because they lost less on those SKUs. The DE v2 allowing for a DD attachment was a stopgap. Sony couldn't plant a flag in the ground and say we are getting rid of discs, so they made it optional.

But all that aside, the real harbinger here is the sales data. In 2025, 83% of all software units sold... were digital. With 17% being physical.

That's not something that anyone can ignore, even you. And if you track this split over the years, digital sales are obviously trending upwards. This shift is so absolute that physical media gamers are going to become the minority. This is why I believe there might not even be a disc drive SKU of the PS6, not even a bundled DD SKU. Outside retailer bundles, the only way to get a DD for the PS6 would likely be buying it separately. And if that turns out to be the case, you might as well kiss disc drives goodbye.
 
What happens with the PS7, maybe close to 10 years away, who's to say. But I'm absolutely confident Sony won't be abandoning discs for the PS6 generation.
 
Pana

It's Nvidia

You realize just how many papers and research they generate in the field they basically own? It's either wilful ignorance or borderline cute to think they didn't think of all this.

🤣, can you chill Buggy… one cannot even joke :P.

I like papers and technical presentations as much as anyone, but after UE5's tons and tons of promise, with the result being one temporal accumulation artefact after another… I remain cautious still.

Sure, they are not Nvidia, fine :).

Will read the paper you linked, but you could rub others' noses in it a bit less when you disagree and share news, eh? I was just replying to the original quote, where this was not making use of HW texture filtering, let alone any trilinear or anisotropic filtering, so it seemed to stand to reason that either the rendering cost for most mortals without expensive Nvidia GPUs or the quality of texturing would take a backseat… especially when making sacrifices at the altar of huge memory size reductions.

Edit: very interesting paper.
 
Thanks for explaining it to me. I just don't understand this part:

3) Before storing in VRAM, perform inference within tensor core/neural array and transcode to standard BCn
I thought PS5 I/O could only get the data to VRAM. So it must get there before the GPU does anything to it, right? But yes, you are right that using Oodle in any part of that pipeline seems useless, even if it were possible.

My main problem with NTC / NTBC is the performance cost of doing it during gameplay. Avoiding that was the whole point of the PS5 I/O complex: loading and decompressing assets without any CPU or GPU performance cost. But here we are talking about doing inference in real time, which could be done on powerful PC GPUs, but I'm not sure how it would impact performance on much weaker console GPUs.
 
Even if that's true, I can't see NTC becoming a standard by launch.
It doesn't need to be - consoles have been at the forefront of adopting non-standard data layouts for - basically their entire existence - that used to be one of the defining advantages of the medium over PCs in particular.
And yes we lost that edge somewhere along the way, but nothing says it can't be revitalised if something truly game changing was available.
That being said - people grossly overestimate the impact of texture compression - from Nvidia's own work, at best we're talking a 2-3x improvement over the traditional pipeline when fine-tuned optimisations are used on both sides (not just on one to make the other look good/bad). Which, don't get me wrong, is meaningful - but it's not going to turn 300GB games into 50GB. You might get 200GB, on a good day.

It's such a fundamental change to how textures are optimized and stored.
Eh - again not the first time we've seen consoles do something like this. Hell even going back as far as PS1 - it was the consoles that first introduced concept of compressed data on discs that was runtime unpacked (not by some installer process), saving memory and improving load-times in the process. Not to even mention the crazy lossy packing used to fit things onto N64 carts, or even older 2d era hardware.

But again, I just don't think this is the magic bullet nearly as much as GPU corpos want you to believe.
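
The 300GB to 200GB estimate above is the usual "only part of the total shrinks" arithmetic. A small worked sketch in Python, assuming only the texture share of an install benefits from the 2-3x ratio; the texture fraction is not given in this thread, so the code solves for what it would have to be. Function names are hypothetical.

```python
def install_size_after(total_gb: float, texture_fraction: float, texture_ratio: float) -> float:
    """Install size when only the texture share is compressed by texture_ratio."""
    textures = total_gb * texture_fraction
    rest = total_gb - textures
    return rest + textures / texture_ratio


def implied_texture_fraction(total_gb: float, target_gb: float, texture_ratio: float) -> float:
    """Solve total * (1 - f + f / r) = target for the texture fraction f."""
    return (1 - target_gb / total_gb) / (1 - 1 / texture_ratio)


for ratio in (2.0, 3.0):
    f = implied_texture_fraction(300.0, 200.0, ratio)
    print(f"{ratio:.0f}x textures: 300GB -> 200GB needs textures to be {f:.0%} of the install")

# Even if the whole install were textures, 3x only gets 300GB down to 100GB.
print(f"best case at 3x: {install_size_after(300.0, 1.0, 3.0):.0f}GB")

# Hitting 50GB would need a texture share above 100%, i.e. it is not reachable.
print(f"fraction needed for 50GB at 3x: {implied_texture_fraction(300.0, 50.0, 3.0):.2f}")
```

So the 200GB figure lines up with textures being somewhere between half and two thirds of a 300GB install, and 50GB is out of reach at these ratios no matter the split.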
 
What happens with ps7 maybe close to 10 years away who's to say. But I'm absolutely confident Sony won't be abandoning discs for ps6 generation.
I don't think they will either.... but I believe a lot will become different.

Eg. I fully expect Sony to follow Nintendo's lead and start pricing digital games lower than their physical counterparts. That situation where games release digitally first, then months later a physical version releases will get worse. So far, Black Myth, Alan Wake, Space Marine, etc. have done it. u At some point, I believe the only physical version released will be the collector's edition.

It's honestly one of those no brainer type things, physical media is an antiquated format that serves no real purpose outside of letting consumers sell their games. We aren't even playing games off the disc lol. Like seriously, that's the only use it has right now. Because we can now even copy games to external hard drives for cold storage.
 
My main problem with NTC / NTBC is the performance cost of doing it during gameplay which was the whole point of PS5 I/O complex: loading and uncompressing assets without any CPU or GPU performance cost. But here we are talking about doing inference in real-time, which could be done on PC powerful GPUs, but not sure how it would impact performance on much weaker console GPUs.
The reason all this compression and texture work is going on is quite simply that textures take up the largest chunk of RAM at any given time. And while RAM size increases are slowing, things are coming up that will also want their own healthy chunk of that RAM. Eg. RT or, worse, path tracing. The RAM occupancy reduction, and what that allows for overall for any engine, is kinda worth the 10-15% performance cost, but that cost is relative to how your game was running to begin with.

There is also the file size thing too, eg. imagine a standard BC texture is like, say, 30 MB. After Oodle and Kraken compression, it can be stored on the PS6 SSD as a 15MB file. If this were an NTC file though, it would only take up like... 2MB. Again, too good to ignore.
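
A quick sketch of what those example sizes work out to as ratios. The three sizes are the illustrative figures from the post above, not measurements, and the dictionary keys are just labels made up here.

```python
# Illustrative sizes from the post above (MB), not measured data.
sizes_mb = {
    "bc_raw": 30.0,        # uncompressed BC texture
    "bc_kraken": 15.0,     # same texture after Oodle + Kraken on the SSD
    "ntc": 2.0,            # hypothetical NTC version
}

def ratio(before_mb, after_mb):
    """How many times smaller `after_mb` is than `before_mb`."""
    return before_mb / after_mb

print("Kraken vs raw BC:", ratio(sizes_mb["bc_raw"], sizes_mb["bc_kraken"]))
print("NTC vs raw BC:   ", ratio(sizes_mb["bc_raw"], sizes_mb["ntc"]))
print("NTC vs Kraken BC:", ratio(sizes_mb["bc_kraken"], sizes_mb["ntc"]))
```

The last ratio is the one that matters for SSD space, since the Kraken-compressed file is what sits on the drive today.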
 
🤣, can you chill Buggy… one cannot even joke :P.

I like papers and technical presentations like everyone, but after UE5 tons and tons of promise and yet the result being one temporal accumulation artefact after the other… I remain cautious still.

Sure they are not nVIDIA fine :).

Will read the paper you linked, but you could rub others' noses in it a bit less when you disagree and share news, eh? I was just replying to the original quote, where this was not making use of HW texture filtering, let alone any trilinear or anisotropic filtering, so it seemed to stand to reason that either the rendering cost for most mortals without expensive nVIDIA GPUs or the quality of texturing would take a backseat… especially when making sacrifices at the altar of huge memory size reductions.

Edit: very interesting paper.

I knew you were pulling my leg and I pulled yours lol

It is indeed an interesting paper

It reminds me of Valve's MSAA modification to attenuate its weakness for Half-Life: Alyx
 
I don't think they will either.... but I believe a lot will become different.

Eg. I fully expect Sony to follow Nintendo's lead and start pricing digital games lower than their physical counterparts. That situation where games release digitally first, then months later a physical version releases will get worse. So far, Black Myth, Alan Wake, Space Marine, etc. have done it. At some point, I believe the only physical version released will be the collector's edition.

It's honestly one of those no brainer type things, physical media is an antiquated format that serves no real purpose outside of letting consumers sell their games. We aren't even playing games off the disc lol. Like seriously, that's the only use it has right now. Because we can now even copy games to external hard drives for cold storage.
Alan Wake 2 initially was released as a digital only game, at $60 so $10 cheaper than regular full release. Apparently they changed their minds later on and released on disc as well.

Black Myth I think perhaps initially was going to be digital only and then with its popularity they decided on disc release as well.

Ultimately only a tiny fraction of all games released.

As for advantages of disc versus digital, being able to lend a game to a friend or my kids makes it an easy decision to keep buying physical. Even if it ends up being $70 versus a cheaper $60 digital.
 
Thanks for explaining it to me. I just don't understand that part :


I thought PS5 I/O could only get the data to vram. So it must get there before the GPU does anything to it, right? But yes you are right using Oodle in any part of that pipeline seems useless, even if that was possible.

My main problem with NTC / NTBC is the performance cost of doing it during gameplay which was the whole point of PS5 I/O complex: loading and uncompressing assets without any CPU or GPU performance cost. But here we are talking about doing inference in real-time, which could be done on PC powerful GPUs, but not sure how it would impact performance on much weaker console GPUs.
Inference on load has no performance cost, but it only reduces texture size on disk. Inference on sample reduces texture size on disk and RAM, but has a performance cost (and needs some work for texture filtering).

Most likely we will only see Inference on load this generation.
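
A toy model of that trade-off, as a sketch only. The mode names, the `Footprint` type and the example sizes are all assumptions made up for illustration; the point is just which of disk and RAM each mode shrinks.

```python
from dataclasses import dataclass

@dataclass
class Footprint:
    disk_mb: float               # size stored on the SSD
    ram_mb: float                # size resident in memory during gameplay
    per_sample_inference: bool   # True if the GPU pays a cost on every texture sample

def footprint(mode, bc_disk_mb, bc_ram_mb, ntc_mb):
    """Where the savings land for each mode (hypothetical sizes in MB)."""
    if mode == "bc":
        return Footprint(bc_disk_mb, bc_ram_mb, False)
    if mode == "inference_on_load":
        # Stored as NTC, transcoded to BC once at load time.
        return Footprint(ntc_mb, bc_ram_mb, False)
    if mode == "inference_on_sample":
        # Stays as NTC in memory, decoded every time it is sampled.
        return Footprint(ntc_mb, ntc_mb, True)
    raise ValueError(f"unknown mode: {mode}")

for mode in ("bc", "inference_on_load", "inference_on_sample"):
    print(mode, footprint(mode, bc_disk_mb=15.0, bc_ram_mb=30.0, ntc_mb=2.0))
```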
 
It doesn't need to be - consoles have been at the forefront of adopting non-standard data layouts for - basically their entire existence - that used to be one of the defining advantages of the medium over PCs in particular.
And yes we lost that edge somewhere along the way, but nothing says it can't be revitalised if something truly game changing was available.
That being said - people grossly overestimate the impact of texture compression - from NVidia's own work at best we're talking a 2-3x improvement to the traditional pipeline when fine-tuned optimisations are used on both sides (not just on one to make the other look good/bad). Which don't get me wrong - is meaningful - but it's not going to turn 300GB games into 50GB. You might get 200GB, on a good day.


Eh - again not the first time we've seen consoles do something like this. Hell even going back as far as PS1 - it was the consoles that first introduced the concept of compressed data on discs that was unpacked at runtime (not by some installer process), saving memory and improving load-times in the process. Not to even mention the crazy lossy packing used to fit things onto N64 carts, or even older 2D era hardware.

But again, I just don't think this is the magic bullet nearly as much as GPU corpos want you to believe.
I think we agree in principle. The reason I brought up the launch window was because we were talking about a reduction in SSD requirements. And if that doesn't happen across the board at launch, Pro users will likely have to downsize the number of games they maintained prior to the PS6.

Like you said, there is a lot of exaggeration and marketing fluff over the gains achieved. They are often ignoring the additional gain Oodle already brings to BCn, at least from a storage standpoint. And devs aren't going to rush to overhaul their pipeline over 2-3x gains. It can take over a minute to train these MLPs to compensate for any perceivable loss in detail. Per texture! Until that training time comes drastically down, or the workflow changes to accommodate that latency, realtime editing and previewing becomes a pain.

NTBC might be the easy button here, as they can use BC textures during dev till the very end before a final neural training and compression pass for game builds. Just like Oodle. But I feel there are still some dev training/comfort level roadblocks and performance unknowns that may affect early adoption. But all the gains in VRAM bandwidth are out the window, along with building really complex materials that can't be reasonably achieved with BC. So I'm still not sure to what extent a halfway solution like that will take hold.

But since NTC will objectively be better in the near future on all fronts, I do see it pushing other standards gradually into obsolescence in the long run. In our land of diminishing returns, even 2-3x is too good to ignore. It's no magic bullet for sure. Just one more thing that replaces the old thing eventually.

Thanks for explaining it to me. I just don't understand that part :


I thought PS5 I/O could only get the data to vram. So it must get there before the GPU does anything to it, right? But yes you are right using Oodle in any part of that pipeline seems useless, even if that was possible.

My main problem with NTC / NTBC is the performance cost of doing it during gameplay which was the whole point of PS5 I/O complex: loading and uncompressing assets without any CPU or GPU performance cost. But here we are talking about doing inference in real-time, which could be done on PC powerful GPUs, but not sure how it would impact performance on much weaker console GPUs.

I think I overly simplified that part and misstated it as a result. Good catch. The data would go to the RAM first. Then it gets picked up for inference during level load (or a load initiated by the level streaming system) and transcoded to BC before being stored again in RAM for the remainder of its time in memory. As you can tell, that's an additional hop and has no gains on VRAM bandwidth or usage. It's a slight loss actually, but with the benefit of lowered SSD storage requirements and using standard BC in memory during gameplay, i.e. no bleeding edge filtering solutions needed.

But here we are talking about doing inference in real-time, which could be done on PC powerful GPUs, but not sure how it would impact performance on much weaker console GPUs.
This is where cooperative vectors and neural arrays come into play. The current PS5 Pro AI cores are only used for less than 10% of the frame time, for PSSR. The rest of the time, they're just idling. It's the case in the PC space as well. AI cores have been heavily overprovisioned as there's no other way to get good performance during the upscaling stage. And register pressure and L1 stalls have really held back how far neural rendering has gone mainstream, even in the PC space, where Nvidia and Intel already support cooperative vectors.
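
For a sense of scale, a small sketch of how much of each frame that leaves free. The frame rates are assumptions picked for illustration, and 10% is treated as the upper bound from the post above; `idle_ms` is a made-up helper.

```python
def idle_ms(fps, busy_fraction):
    """Milliseconds per frame left over if the AI cores are busy for `busy_fraction` of it."""
    frame_ms = 1000.0 / fps
    return frame_ms * (1.0 - busy_fraction)

# ASSUMPTION: common target frame rates; busy_fraction=0.10 is the "less than 10%" ceiling.
for fps in (30, 60, 120):
    print(f"{fps} fps: at least {idle_ms(fps, 0.10):.1f} ms idle per frame")
```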

All of this, at least on paper, will change in the next few years. It actually seems like Neural Arrays will be more capable than Blackwell Tensor cores, as they have their own dedicated interconnect that's faster than L1 and a shared scratchpad memory that can dramatically reduce contention for registers. They can effectively run as a cluster to execute larger, more complex models. And it allows the AI cores to run in parallel throughout the frame time while the shader cores perform traditional render tasks, which means all these additional neural rendering techniques can be added on with minimal overhead.

Inference on load has no performance cost, but it only reduces texture size on disk.

Technically, the inference itself still has a performance cost. It's just much smaller as it doesn't have to keep repeating it again. And then everything else would work just the way it always did, post-inference.

Most likely we will only see Inference on load this generation.
Why though? If Ubisoft could already roll out their (smarter imo) version of Inference on sampling in a reasonably sized game, I don't see why inference on sampling will not be adopted across the board mid-gen, especially with the new architecture that would effectively negate or trivialize the performance cost.
 
Inference on load has no performance cost, but it only reduces texture size on disk. Inference on sample reduces texture size on disk and RAM, but has a performance cost (and needs some work for texture filtering).

Most likely we will only see Inference on load this generation.

And inference on load you still save on bandwidth.

I still think they would leave both options up to devs.
 
And inference on load you still save on bandwidth.
Nope. There's actually a slight added cost to bandwidth for doing the inference on load. Where are you getting this from? I already explained why it's not the case earlier.
 
You save PCI bandwidth and use extra VRam bandwidth. Also that's PC only - on console there's 0 savings, only use.
Which, incidentally, is the same as what BCn with any one of the LZ coders does (or PC using the GPU decompression I/O thing - if they ever get it working again).
 
You save PCI bandwidth and use extra VRam bandwidth. Also that's PC only - on console there's 0 savings, only use.
Which, incidentally, is the same as what BCn with any one of the LZ coders does.

Why would consoles have 0 savings? The link between SSD and the APU is still a limit.
 
Why would consoles have 0 savings? The link between SSD and the APU is still a limit.
Because (without specialised hw in I/O block) you'd write back to memory first and then unpack. So you're actually having a net-bandwidth loss compared to things that I/O block does with eg. Kraken.
Obviously flipside is that console bus isn't as problematic as PCI lanes but still.
 
Why would consoles have 0 savings? The link between SSD and the APU is still a limit.
What Fafalada said.

There's no shortage of I/O bandwidth in consoles. It's already overprovisioned. So when we talk about bandwidth savings in consoles from NTC, the only thing that matters is VRAM bandwidth. And there's technically a net loss in VRAM bandwidth (albeit not a lot) for inference on load.
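
The net loss can be sketched by counting bytes that cross the memory bus. This is a simplified model under assumptions: the I/O block path is counted as one write of the unpacked BC data, the inference-on-load path as writing the compressed data, reading it back for inference, then writing the BC result. The 10GB and 3GB sizes are placeholders, and `ram_traffic_gb` is a made-up helper.

```python
def ram_traffic_gb(bc_gb, ntc_gb, path):
    """Total data moved over the memory bus to get BC textures resident (simplified)."""
    if path == "io_block":
        # Dedicated hardware unpacks on the way in, so RAM only sees the final BC data.
        return bc_gb
    if path == "inference_on_load":
        # Compressed data lands in RAM, is read back for inference, then BC is written.
        return ntc_gb + ntc_gb + bc_gb
    raise ValueError(f"unknown path: {path}")

bc_gb, ntc_gb = 10.0, 3.0  # placeholder sizes, not measurements
base = ram_traffic_gb(bc_gb, ntc_gb, "io_block")
iol = ram_traffic_gb(bc_gb, ntc_gb, "inference_on_load")
print(f"I/O block: {base} GB, inference on load: {iol} GB, extra: {iol - base} GB")
```

In this model the extra traffic is always twice the compressed size, which is why the loss stays small as long as the neural format is much smaller than BC.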
 
And inference on load you still save on bandwidth.

I still think they would leave both options up to devs.
You don't save on bandwidth with IOL, it just saves disc space. It even comes at a "cost," though that part is weird.

Eg. if you are moving 10GB of BC textures into RAM, and that takes 1 second, with NTBC, that same set of textures might only come in at 3GB. It's even faster to move 3GB into RAM, but now you have to convert them back to BC... ideally, this cost should be invisible to the end user... but it's there.
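
A minimal sketch of that example in numbers. It assumes the transfer and the BC conversion happen one after the other with no overlap, which is a simplification, and `load_time_s` is a made-up helper.

```python
def load_time_s(payload_gb, link_gb_per_s, transcode_s=0.0):
    """Transfer time plus conversion time, assuming the two steps do not overlap."""
    return payload_gb / link_gb_per_s + transcode_s

link = 10.0 / 1.0                      # 10GB in 1 second, from the example above
bc_time = load_time_s(10.0, link)      # plain BC: transfer only
ntbc_transfer = load_time_s(3.0, link) # NTBC: smaller payload, conversion not yet counted
budget = bc_time - ntbc_transfer       # conversion must fit in this to break even

print(f"BC: {bc_time:.1f}s, NTBC transfer: {ntbc_transfer:.1f}s, "
      f"conversion budget: {budget:.1f}s")
```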

The real standout benefit of NTBC is smaller game sizes for you to download and store on your SSD.

PCIe traffic and VRAM bandwidth are two totally different things. Notice in both BCn and NTBC the image is taking up the same size in RAM? PCIe traffic is just moving data from your ssd to RAM. Hence why I said your transfer speed is faster.
 
or PC using GPU decompression I/O thing - if they ever get it working again.
Wait... that never happened? Was it pulled off the roadmap? No wonder people still talk about wasting a CPU core on it.... I'm a console peasant, so I haven't been keeping track.
 
Wait... that never happened? Was it pulled off the roadmap? No wonder people still talk about wasting a CPU core on it.... I'm a console peasant, so I haven't been keeping track.
It didn't take because in reality, and unlike the PS5, GPU decompression is not free. You are basically asking the GPU shaders to not just render frames, but simultaneously decompress data coming into it. For event critical data, you can start to see how this becomes a problem, cause at that point you can no longer guarantee proper performance across different GPUs, unlike a PS5 that has a dedicated block that does this decompression independent of the GPU.

It works, just comes at a cost that makes it not worth it when you consider that all you are doing is saving disc space and boosting data transfer speed.
 
It didn't take because in reality, and unlike the PS5, GPU decompression is not free. You are basically asking the GPU shaders to not just render frames, but simultaneously decompress data coming into it. For event critical data, you can start to see how this becomes a problem, cause at that point you can no longer guarantee proper performance across different GPUs, unlike a PS5 that has a dedicated block that does this decompression independent of the GPU.

It works, just comes at a cost that makes it not worth it when you consider that all you are doing is saving disc space and boosting data transfer speed.
But it's still better than leaving it up to the CPU.... I guess that explains why Direct Storage itself hasn't caught on yet.
 