
Advanced DirectX12 Graphics and Performance - Discussion

Iorv3th

Member
Oh god that sounds horrible to me, I do a clean install every 6 months, no matter how well the thing is running. Am I mad GAF?

You are mad if you are doing it for performance reasons. If you just really enjoy the process of doing it, then I guess you have a valid reason?

But it brings no benefit whatsoever. I know some people believe in Windows degradation, but there are things you can run to clean the system up to the point where a fresh install brings no real benefit: CCleaner (including its registry cleaner), and chkdsk every now and then.
 
I'm sorry, but when someone starts throwing around the combined bandwidth value of the X1, I just switch off.

They go into the calculations in a lot more detail, and he agrees that you can't just smack the numbers together and call it a day.

https://forum.beyond3d.com/threads/directx-12-the-future-of-it-within-the-console-gaming-space-specifically-the-xb1.55487/page-55

That's a good question. I'm fairly certain the average bandwidth for Xbox One isn't only slightly above 68 GB/s; that would put it in approximately Radeon 4850 territory. You also have to factor in CPU contention on that pool of memory, so the GPU would be operating well below 30 GB/s there. The performance of today's titles suggests they are using much more, so eSRAM utilization must be better than we expect. Peak bandwidth in game code has been recorded at ~140 GB/s in eSRAM, so the question is how often that number is sustained.

It occurs to me that this is like a reverse GeForce 970 situation.

IIRC, the eSRAM can read and write simultaneously on 7 out of every 8 cycles; on the 8th cycle it must idle. It can do 8 reads in a row, or 8 writes, but when doing both it has to rest on the 8th cycle. That is how the 192 GB/s figure is obtained.

As for the article: yes, it is about fully heterogeneous systems. The article is about shifting bottlenecks, and in this case the question is how much bandwidth is required for bandwidth to no longer be the bottleneck (for 32 CUs). 700 GB/s was the amount that they found.

As for your assertion that Xbox One cannot achieve 242 GB/s: it can simultaneously modify existing data in eSRAM while reading and writing data in DDR3. They are separate pools. The idea is to keep the workload in eSRAM; data is moved in and out of eSRAM as required.

The article was topically about GPGPU on heterogeneous systems. As they begin to move towards HBM, they determined that the GPU side would get bigger, so eventually the bandwidth would need to increase as well. In this case 700 GB/s was found to be enough for all workloads at 32 CUs, or in the simplest scenario, reading 350 GB/s while writing out 350 GB/s. I'm unsure if asymmetrical workloads are considered, i.e. reading 600 GB/s and, say, performing a reduce and only writing out 100 GB/s.

Bandwidth is only a measure, and the article does make a good point about how idle ALU units are. That's why, in the past, running some GPU stress tests could burn your chip: those algorithms never gave the chip a chance to rest, and the stock cooler was not adequate for sustained 100% usage.

Linking back to my OP, this was the concern. We are moving to an era where CPU load is reduced and the CPU can focus on other things, while GPU load is greatly increased. Ultimately both now have room to increase, which will lead to greater demands on the PSU and on cooling.

I'm hesitant to respond to this because it's more complex than you make it out to be. Some of the senior members will be able to provide better insight into how specific memory access patterns would ultimately increase bandwidth.

The eSRAM is designed to hold only 2/3 of the back buffer, and textures are streamed into eSRAM as needed; they do not need to be streamed back out. In fact it's almost useful to look at eSRAM as nearly a black hole, with the exception that the render target will eventually come back.

The remaining space in eSRAM is used for scratch-pad work, which is where the majority of the bandwidth should be consumed.

Bandwidth is a unit of transport; I've never considered it a measure of speed. Speed is about velocity and distance, while bandwidth is an aggregation of data moved over a set interval of time. You'd agree that electricity travels at nearly the speed of light, so why do we get 56K on a phone line yet so much more on an OC-48? It's because the OC link carries multiple frequencies, all sending different data at the same time. The data isn't flowing faster than light; there is just more of it in flight. In telecom we aggregate smaller data streams into larger ones, and those larger ones go cross-country over big optical pipes or via satellite or whatever the case may be, but we are still bound by the same speed: the speed at which the signal propagates.

edit: that example might lead us off-topic. So let's make it simple. I have 8 GB of GDDR on PS4, and I do the following:
int hello = 8;
int world = hello;

If I have 4 GB of GDDR and do the same thing, will the 8 GB machine execute that statement in half the time?


So bandwidth is not speed; bandwidth is about how much data can fit through at once. When you execute the following statement:
hello = world + hi + var2 + var3 + object12 * happyworld;
we're pulling lots of data from different locations at the same time, which a wide bus can do in parallel. If I don't have a wide enough bus, I will need to spend additional clock cycles gathering that data before I can begin processing.

The GPU is an aggregator of bandwidth: whatever lanes are free for it to pull and push on are what it can work with. When you gave the example of two fax machines each doing one fax per minute versus one fax machine doing two faxes per minute, that was a comparison of processing speed.

You haven't sold me yet on why it all needs to come from a single pool, or why it needs to be 8 GB.
 
Doing some math indicates that at 853 MHz with 12 CUs, the bandwidth required to remove the bandwidth bottleneck for the CUs is approximately ~240 GB/s. That happens to be roughly the total combined bandwidth of Xbox One (192 GB/s + 67 GB/s).


I love it. This gen has been just as crazy for the nutters as previous ones despite how much more obvious it is.
 
Everyone in that thread seems to have "some general knowledge", and some of them disagree with iRoboto. So I'm just asking: why should I take his word over what the others are saying?

I never said you should. I was just posting his commentary around the power usage and the cooling situations with both consoles which I thought was an interesting concern regarding the consoles as they go forward.
 

curb

Banned
I never said you should. I was just posting his commentary around the power usage and the cooling situations with both consoles which I thought was an interesting concern regarding the consoles as they go forward.

It seems to me that people haven't given Microsoft enough credit with regard to the design of the XB1. I've seen lots of people credit the size of the machine solely to wanting to avoid another RROD scenario. I think there's also an assumption that Sony caught them completely off guard and that the XB1 was a rushed machine. It does seem to me that, despite maybe being caught off guard in some areas, Microsoft has at least had a long view of the console as far as DX12 is concerned. If they can leverage a piece of hardware the way DX12 seems to be promising, the XB1 seems much more capable of handling the power usage and cooling requirements. It'll be interesting to see which DX12 features manage to bleed over into the PS4 APIs and how they will affect power and cooling.
 
It seems to me that people haven't given Microsoft enough credit with regard to the design of the XB1. I've seen lots of people credit the size of the machine solely to wanting to avoid another RROD scenario. I think there's also an assumption that Sony caught them completely off guard and that the XB1 was a rushed machine. It does seem to me that, despite maybe being caught off guard in some areas, Microsoft has at least had a long view of the console as far as DX12 is concerned. If they can leverage a piece of hardware the way DX12 seems to be promising, the XB1 seems much more capable of handling the power usage and cooling requirements. It'll be interesting to see which DX12 features manage to bleed over into the PS4 APIs and how they will affect power and cooling.

Agreed. They built the console to be quiet and cool, and I don't think the RROD was the only consideration. I never really thought about the heatsink on the XBO and its impact on cooling versus the PS4's structure.
 

curb

Banned
Agreed. They built the console to be quiet and cool, and I don't think the RROD was the only consideration. I never really thought about the heatsink on the XBO and its impact on cooling versus the PS4's structure.

I remember there being concerns over the PS4's cooling before launch, and so far it's been fine. Given some of the considerations from iRoboto, I think there are still a lot of unknowns about what could happen should some of these features be implemented in the PS4 APIs, but I can't help wondering whether the lack of an external power brick could have any negative impact. We don't really know, of course, and perhaps Sony prepared for heavier resource usage down the line. The next few years will be interesting, though.
 