This is wrong, but I have a feeling it's pointless to have this discussion, so thanks for saving me the time of looking those threads up for you. =D
LOL wtf is this. If you have a retort, post it. You're not doing me any favors, pal. Especially not by dangling supposed info in my face.
If you think they have to go through a "testing regime" for the ALUs by "feeding MADDs" to it once they have an entire GPU on their hands to see if it can "do these three flops within the clock at an acceptable rate" then I really have nothing more to say to you.
So can you point out any particular deficiencies regarding threads in Windows that lead to the massive overheads you describe and which are not easily circumvented at the application level in a game setting? Or does your knowledge only extend to throwing around buzzwords?
I oversimplified by saying "card" and that's my bad, but if you want to, we can get into it. During the design phase of a shader core, the ALUs are extensively tested for their throughput and relative consistency. No two ALUs are the same, nor are they measured the same way, but a MADD operation is by far the most common arithmetic chosen for the test, something like the sketch below.
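Purely to illustrate what "feeding MADDs" looks like as a microbenchmark, here's a minimal CPU-side sketch in C++. It's an analogy only; the real thing happens in the vendor's silicon validation flow, and the iteration count, constants, and four-accumulator unroll here are arbitrary values I made up:

```cpp
#include <chrono>
#include <cstdio>

// Toy MADD (multiply-add) throughput loop. Four independent accumulators so
// we measure throughput rather than the latency of one dependent chain.
// ITERS and the constants are arbitrary illustration values.
int main() {
    constexpr long long ITERS = 100'000'000;
    float a0 = 1.0f, a1 = 1.1f, a2 = 1.2f, a3 = 1.3f;
    const float b = 1.0000001f, c = 0.0000001f;

    auto t0 = std::chrono::steady_clock::now();
    for (long long i = 0; i < ITERS; ++i) {
        a0 = a0 * b + c;    // one MADD each, no dependency between chains
        a1 = a1 * b + c;
        a2 = a2 * b + c;
        a3 = a3 * b + c;
    }
    auto t1 = std::chrono::steady_clock::now();

    double secs = std::chrono::duration<double>(t1 - t0).count();
    // 2 flops per MADD, 4 MADDs per loop iteration
    std::printf("%.1f MFLOP/s (sum %f)\n", 8.0 * ITERS / secs / 1e6,
                a0 + a1 + a2 + a3);
    return 0;
}
```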
As far as good ole Windows goes, my knowledge doesn't go past my studies, which were all centered around Vista. How much of this is still true two OSes later is beyond me, but AFAIK Windows uses round-robin (RR) scheduling. To put it basically: it gives each thread a time slice (quantum) sized by a certain weight, and will interrupt that thread if a) the time runs out or b) something with higher priority takes precedence. I don't know what you're thinking, but
yes, most kernel functions (invisible to the likes of Task Manager or whatever) take priority over anything extra you may have running, and
no, it's not "easily circumvented at the application level", especially
not in a "game setting". It's a fundamental "flaw" for performance, I guess, but it's required for a commercial OS for obvious reasons. Depending on your hardware you might not even notice, but still. It's one of many trade-offs you make in a "PC" versus a specific slapped-together set of chips. See the toy sketch below for what I mean.
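Here's a toy simulation of the quantum-plus-priority behavior I'm describing, in C++. The task names, the quantum length, and the two-level priority split are all invented for illustration, and it only models preemption at slice boundaries, which a real scheduler doesn't limit itself to:

```cpp
#include <cstdio>
#include <deque>
#include <string>

// Toy round-robin with priority preemption. Two queues stand in for Windows'
// many priority levels; task names and the quantum are made up.
struct Task { std::string name; int remaining; };

int main() {
    const int QUANTUM = 3;  // arbitrary time slice
    std::deque<Task> high { {"kernel_work", 2} };
    std::deque<Task> low  { {"game_thread", 5}, {"audio_thread", 4} };

    int clock = 0;
    while (!high.empty() || !low.empty()) {
        // (b) anything runnable at higher priority goes first
        std::deque<Task>& q = high.empty() ? low : high;
        Task t = q.front();
        q.pop_front();

        int slice = t.remaining < QUANTUM ? t.remaining : QUANTUM;
        std::printf("t=%2d: run %-12s for %d tick(s)\n",
                    clock, t.name.c_str(), slice);
        clock += slice;
        t.remaining -= slice;

        // (a) quantum expired with work left: back of the line
        if (t.remaining > 0) q.push_back(t);
    }
    return 0;
}
```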
The most performance-efficient algorithm, in my eyes, is "first come, first served" (FCFS). It's the most common in parallel, performance-oriented machines like supercomputers. I can't say if any previous-generation consoles use it, but given Tim Lottes' post about the PS4 requesting a real-time OS, I'm gonna guess it's not used. There's a contrasting sketch below.
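And FCFS by contrast, same disclaimer that the job names and lengths are invented: no quantum, no preemption, each job just runs to completion in arrival order.

```cpp
#include <cstdio>
#include <queue>
#include <string>

// Toy FCFS: no quantum, no preemption, jobs run to completion in arrival
// order. Job names and lengths are made up.
struct Job { std::string name; int length; };

int main() {
    std::queue<Job> q;
    q.push({"sim_step", 5});
    q.push({"physics",  3});
    q.push({"render",   4});

    int clock = 0;
    while (!q.empty()) {
        Job j = q.front();
        q.pop();
        std::printf("t=%2d: %s runs %d tick(s) to completion, uninterrupted\n",
                    clock, j.name.c_str(), j.length);
        clock += j.length;  // nothing can take the core away from it
    }
    return 0;
}
```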
These are just a few of my gripes, but maybe you can glimpse the point. I hope.
Did you seriously just try to tell Durante he doesn't know what he's talking about....
Fucking lol.
I wonder how many of the console gamers in this thread actually know what SVOGI stands for.
Yeah, I did. Again, the name Durante doesn't mean a damned thing to me. It's not particularly insulting, but please drop the shtick, like I'm arguing with some Carmack-type tech god.
Him going on about how "he could code some, probably single-threaded, CUDA program and get close to theoretical max performance" didn't inspire too much confidence.