Damn, this is moving fast. Didn't think 24 Gbps would happen until GDDR7, and that's probably for 2023-2024.
Dunno if RDNA3 will bother with it, though. AMD is trending toward lower main memory bandwidth offset by a lot of on-die Infinity Cache, which works out better for their design: more data sits closer to the chip in a (relatively large, for SRAM) pool of memory with higher bandwidth and lower latency than GDDR6/6X.
Or maybe they will use it, but as a way to get good VRAM bandwidth while simplifying the memory controller setup. For example: instead of needing a 256-bit interface with 8x 16 Gbps GDDR6 chips for 512 GB/s, you could do a 192-bit interface with 6x 24 Gbps chips and get 576 GB/s, or just clock the chips a bit lower and still hit 512 GB/s.
You lose some capacity (12 GB vs. 16 GB), but you get a simpler memory controller setup and a smaller card. It could work for a few different mid-range designs, maybe.
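Quick back-of-the-napkin math for those configs (hypothetical SKUs, just illustrating the arithmetic; assumes 32 data pins and 2 GB per GDDR6 chip):

```python
# GDDR bandwidth/capacity math for the hypothetical configs above.
# Per-chip: 32 data pins; bandwidth (GB/s) = bus width (bits) * pin rate (Gbps) / 8.

def gddr_config(chips, gbps_per_pin, gb_per_chip=2, pins_per_chip=32):
    bus_width = chips * pins_per_chip           # total bus width in bits
    bandwidth = bus_width * gbps_per_pin / 8    # GB/s
    capacity = chips * gb_per_chip              # GB
    return bus_width, bandwidth, capacity

print(gddr_config(8, 16))  # (256, 512.0, 16) -> 256-bit, 512 GB/s, 16 GB
print(gddr_config(6, 24))  # (192, 576.0, 12) -> 192-bit, 576 GB/s, 12 GB
```

So the narrower bus with faster chips trades 4 GB of capacity for a bandwidth bump and two fewer memory channels to route.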
Not for a future GPU multiple times as fast as the current top end, though.
There's only so much SRAM you can fit on a die, and as we go below 5nm costs will balloon further, meaning even less incentive to have large on-die caches.
IC is effectively a stop-gap solution, necessitated because memory bandwidth hasn't scaled nearly as well as transistor count on the processor cores. Once HBM becomes low enough cost in relative terms, IC will be quickly discarded for the TB/s worth of bandwidth and multiple tens of GB memory capacity that HBM can offer.
Possibly. But if IC is already doing well enough in rasterization workloads for RDNA 2 to keep up with (if not beat) Ampere, why get rid of it? They'll never fully discard IC IMO, and some of the patents and documentation for V-Cache (at least per the sources I've read discussing them) make it sound like they'll be bringing that to the GPU side as well, so they clearly see it as part of their long-term solution.
HBM prices will have to come down a bit more before we really see it back in mainstream consumer GPU designs, and even then it depends on which HBM we're talking about. HBM3 IIRC already has some samples testing at 6 or 7 Gbps, but you can bet those chips will go for a premium, too much of a premium for even high-end consumer GPUs for the next 3-4 years I'd say. Plus, while HBM has lower latency and higher bandwidth than GDDR, its latency still can't beat an SRAM cache's. Even supposing IC is the slowest cache on an AMD GPU, its latency will still beat HBM's, and its bandwidth will still win once costs are taken into account: you'd need a good few HBM3 8-Hi stacks to match the bandwidth of a couple hundred MB of IC by the RDNA 4 generation.
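For a rough sense of that stacks-vs-cache comparison: HBM3 uses a 1024-bit interface per stack, and AMD quoted roughly 1.94 TB/s of effective bandwidth for the 128 MB of IC on Navi 21. The 6.4 Gbps pin rate below is just an assumed early-HBM3 figure, and future IC bandwidth would only be higher, so this is a conservative sketch:

```python
# Rough comparison: HBM3 per-stack bandwidth vs. quoted Infinity Cache
# effective bandwidth. 1024 bits/stack is from the HBM3 spec; 6.4 Gbps/pin
# is an assumed early sample rate; 1940 GB/s is AMD's quoted Navi 21 figure.
import math

HBM3_PINS_PER_STACK = 1024
hbm3_gbps_per_pin = 6.4

per_stack_gbs = HBM3_PINS_PER_STACK * hbm3_gbps_per_pin / 8   # GB/s per stack
ic_effective_gbs = 1940                                       # Navi 21 IC figure

stacks_needed = math.ceil(ic_effective_gbs / per_stack_gbs)
print(per_stack_gbs, stacks_needed)  # 819.2 GB/s per stack -> 3 stacks to match
```

And that's just matching today's IC figure, before any generational bandwidth growth on the cache side.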
IMHO, IC isn't going anywhere; memory trends have been moving data closer to the chip anyway, so you'd end up with an arguably worse design by dropping IC and replacing both it and GDDR with just HBM3 or whatever. But if you balance the IC capacity against a reasonable amount of HBM3 and cut out GDDR altogether, you get a very capable design that isn't compromising out of a self-inflicted wound (or compromising at all, aside from potentially IC and VRAM capacity to stay within a certain budget).