
DX12 update for Tomb Raider out now

guys you should temper your expectations wrt dx12 performance improvements when gpu limited on nvidia cards. other than algorithms that make direct use of the 11.3/12_1 hardware features, nvidia's architecture currently can't benefit from multi engine rendering on the gpu side, so general performance will at best be equal to dx11.
 

BPoole

Member
10 fps less with fury x in 1440p

With the latest AMD driver (16.3) installed, I'm getting slight improvements on an R9 Fury (1080p):

DX11 Overall: 57.91
Mountain Peak: 60.00
Syria: 58.48
Geothermal Valley: 55.34

DX12 Overall: 59.27
Mountain Peak: 58.09
Syria: 59.87
Geothermal Valley: 59.95

The Geothermal Valley definitely felt quite a bit smoother on DX12 (at least in the benchmark, haven't tried it in gameplay yet).

With the previous driver, I did get about 8FPS less on average with DX12, though, so I guess those who're getting worse performance on AMD GPUs with DX12 are still on the old driver.

I am experiencing worse performance as well

1440p Max Settings except regular AO and no AA
6600k
390X

DX12: [benchmark screenshot]

DX11: [benchmark screenshot]

Easy_D

never left the stone age
Your DX11 bench shows a minimum of 9/10 FPS while DX12 shows a minimum of 36/24 FPS. That's an improvement.

Yes, there might be some headroom for the upper limits of the framerate, but damn, again, that increase in min FPS is insane and should remove a lot of visible stutter. Still shows a lot of promise :D
 

BPoole

Member
Your DX11 bench shows a minimum of 9/10 FPS while DX12 shows a minimum of 36/24 FPS. That's an improvement.

Yes, there might be some headroom for the upper limits of the framerate, but damn, again, that increase in min FPS is insane and should remove a lot of visible stutter. Still shows a lot of promise :D

That is true. When I played, I had some of the settings turned down from what they were in this benchmark. Having the smaller FPS margin in DX12 would probably be better for this game, considering some major areas are more demanding than others, and those downward FPS spikes would be mitigated.
 
I haven't checked the benchmark, but the game runs a few frames lower and it's stuttering all the time now with DX12 on. Using a 290X and an i7 6700K.
 

TSM

Member
Something is really off about these benchmarks. How do you get such huge minimum frame rate improvements and yet have drops in the average frame rate? If you have such large improvements in frame times, you would think the averages and maximums would rise as well.
 

tuxfool

Banned
Something is really off about these benchmarks. How do you get such huge minimum frame rate improvements and yet have drops in the average frame rate? If you have such large improvements in frame times, you would think the averages and maximums would rise as well.

I'll take a stab at it and suggest that the DX12 path in the game isn't very optimized compared to DX11 with its years of driver optimizations. However, with DX12 the baseline frame rate, when there are a lot more draw calls or when the CPU is more loaded, is a lot higher than when passing all that stuff through a thick DX11 driver layer.

But since the benchmark doesn't provide a lot of information, all that is speculation.
 

TSM

Member
I'll take a stab at it and suggest that the DX12 path in the game isn't very optimized compared to DX11 with its years of driver optimizations. However, with DX12 the baseline frame rate, when there are a lot more draw calls or when the CPU is more loaded, is a lot higher than when passing all that stuff through a thick DX11 driver layer.

But since the benchmark doesn't provide a lot of information, all that is speculation.

People are reporting seeing stuttering even though improvements in the minimum frame rate should mean an overall smoother experience. It seems like the problem is deeper than that.
 

Lonely1

Unconfirmed Member
Specs:
980Ti @stock
i7 5930K @4.5GHz
32GB DDR4 @2.6GHz

Settings:
1440p @144Hz.
HBAO+
No AA.

Everything maxed except Shadows (High), PureHair (On), Specular Reflections (High).

Dx11: [benchmark screenshot]

Dx12: [benchmark screenshot]

Now, the benchmark reports a lower overall framerate for DX12, but it might be more consistent, IDK. What bothered me is that I saw texture flicker under DX12 during the benchmark, so I see no point in going DX12 at all for this game atm.
 
I preferred the increased minimum frame rate over DX11, but now the game crashes every 10-20 minutes or so in DX12 on my GTX 970. Likely an Nvidia issue, as usual.
 

hlhbk

Member
The mountain textures in the very first part of the game flicker when DX12 is turned on. Anyone else running into this?
 

KainXVIII

Member
guys you should temper your expectations wrt dx12 performance improvements when gpu limited on nvidia cards. other than algorithms that make direct use of the 11.3/12_1 hardware features, nvidia's architecture currently can't benefit from multi engine rendering on the gpu side, so general performance will at best be equal to dx11.

Ok, but AMD performance is also worse.
 
it just seems odd to write off the fundamental improvement dx12 brings to the gpu side of rendering as an amd advantage that games must specifically target

Well, are the games which are currently benchmarking better on AMD specifically making heavy use of asynchronous compute? Yes or no?

You're acting like game engines are all going to use this a lot now just because it's available. That seems unlikely, because most games will still target DX11 for compatibility reasons.

Also, you're acting like Nvidia won't ever implement a hardware scheduler. What happens if Pascal has one? I can't wait to see how the AMD fanboys behave when this one vaunted feature of DX12 becomes available on both sides.
 

DSN2K

Member
I feel there is something not right with this DX12 implementation; perhaps it is deliberately conservative with its use due to Nvidia issues...
 
Well, are the games which are currently benchmarking better on AMD specifically making heavy use of asynchronous compute? Yes or no?

You're acting like game engines are all going to use this a lot now just because it's available. That seems unlikely, because most games will still target DX11 for compatibility reasons.

Also, you're acting like Nvidia won't ever implement a hardware scheduler. What happens if Pascal has one? I can't wait to see how the AMD fanboys behave when this one vaunted feature of DX12 becomes available on both sides.

lol. not an amd fanboy. good try tho. i'd imagine most if not all of the high profile games that support dx12 will absolutely use multi engine gpu rendering. it's already starting to gain heavy traction in the console space, as it's the best way to keep gpu utilization as high as possible at all times. i don't see why they wouldn't use it. future nvidia gpu architectures that may possibly support multi engine gpu rendering have no relevance today. i hope they do, but i have my doubts that it will come with pascal.

and why wouldn't dx12 games make heavy use of async compute? i'm trying hard to think of reasons why it would be in a developer's interest not to use something they're probably already using on the consoles, but i'm coming up blank.
 
The update still didn't fix the graphical glitches that pop up in my game, where parts of the environment (and eventually the whole screen) get tinted in solid colors or even turn completely black.

The Mines were a pain to go through under these conditions :(

At least the performance went way up! That's something.
 
lol. not an amd fanboy. good try tho. i'd imagine most if not all of the high profile games that support dx12 will absolutely use multi engine gpu rendering. it's already starting to gain heavy traction in the console space, as it's the best way to keep gpu utilization as high as possible at all times. i don't see why they wouldn't use it. future nvidia gpu architectures that may possibly support multi engine gpu rendering have no relevance today. i hope they do, but i have my doubts that it will come with pascal.

and why wouldn't dx12 games make heavy use of async compute? i'm trying hard to think of reasons why it would be in a developer's interest not to use something they're probably already using on the consoles, but i'm coming up blank.

It might be because, I dunno, a console is fixed hardware and PCs aren't? I'm not a rocket scientist, but even I know that the total number of gaming PCs out there with the hardware which greatly benefits from async compute, specifically a very weak CPU with a strong GPU made by AMD, is a small percentage of the market. Async compute is amazing for helping out the netbook-class CPUs in the PS4 and Bone. It's not quite as useful when your gaming PC has an i7. Even the vaunted Ashes of the Singularity, the poster child for async compute, has Nvidia DX11 performing similarly to AMD DX12.
 
It might be because, I dunno, a console is fixed hardware and PCs aren't? I'm not a rocket scientist, but even I know that the total number of gaming PCs out there with the hardware which greatly benefits from async compute, specifically a very weak CPU with a strong GPU made by AMD, is a small percentage of the market. Async compute is amazing for helping out the netbook-class CPUs in the PS4 and Bone. It's not quite as useful when your gaming PC has an i7. Even the vaunted Ashes of the Singularity, the poster child for async compute, has Nvidia DX11 performing similarly to AMD DX12.

i don't think you understand what async compute is. it has nothing to do with the cpu. it's concurrent execution of compute threads and various other tasks on the gpu. here's a good read

http://www.anandtech.com/show/9124/amd-dives-deep-on-asynchronous-shading

just ignore the part about maxwell 2's 1+31 configuration. it's technically correct, but nvidia cannot execute graphics and compute simultaneously; a full flush of the entire gpu is required every time you switch between the 2. here's 2 links that better explain nvidia's current position, the first of which strongly hints that volta is the earliest we could see real changes to nvidia's ability to benefit from multi engine gpu rendering

http://www.extremetech.com/extreme/...ading-amd-nvidia-and-dx12-what-we-know-so-far
http://ext3h.makegames.de/DX12_Compute.html

another link with good info

http://www.overclock.net/t/1572716/directx-12-asynchronous-compute-an-exercise-in-crowd-sourcing
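
and if it helps, here's roughly what multi engine looks like api-side. a minimal d3d12 sketch i threw together (illustrative only, nothing to do with this game's actual code): you create a separate compute queue next to the graphics queue, and cross-queue dependencies go through fences. whether work on the two queues actually overlaps is up to the hardware.

// minimal d3d12 sketch (illustrative only): one graphics queue, one compute queue
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& gfxQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // graphics + compute + copy
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));

    D3D12_COMMAND_QUEUE_DESC cmpDesc = {};
    cmpDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute + copy only
    device->CreateCommandQueue(&cmpDesc, IID_PPV_ARGS(&computeQueue));

    // dependencies between queues are expressed with fences, e.g.
    //   gfxQueue->Signal(fence, value);
    //   computeQueue->Wait(fence, value);
    // on gcn the two queues can genuinely run concurrently; on current
    // nvidia hardware, switching between graphics and compute forces a flush.
}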
 

charsace

Member
I am on a lappy with a 970m.
In DX12 I lose a frame off the average, but the min frame rate is 20-25fps higher. I also notice that the draw distance is better in DX12 even though I changed no settings.
 
damn, again, that increase in min FPS is insane and should remove a lot of visible stutter.

the game runs a few frames lower and it's stuttering all the time now with DX12 on.

I'm also reading quite a few reports of the actual game running worse than the benchmark and stuttering a lot more, plus lots of crashes in DX12.

Can we call both of these essentially frauds:
1- Nvidia promising that their cards are fully DX12 compatible, when they VERY OBVIOUSLY are not (async compute still disabled in DX12)
2- this official(?) dev blog: http://tombraider.tumblr.com/post/140859222830/dev-blog-bringing-directx-12-to-rise-of-the-tomb
What's THAT? They claim a 46 -> 60 fps improvement on a 970. That is the actual 10-20% improvement we expected from DX12 and that was always used to publicize it (see Mantle and everything else).
WHERE IS THIS MAGIC DRIVER they used in that article?
Why can no one have it?
Why does the released version crash, stutter, and give lower performance on the SAME hardware, despite that dev blog claiming to have a functional driver showing a performance increase?
AT LEAST Mantle WAS marginally faster. Not slower.

This is the beginning of the bleak times I have ranted about many times:

- Right now, game optimization is mostly done at the driver level. PC games get time-starved ports, which means not much time is spent optimizing things natively on PC. This is the essence of PC gaming: not enough time to optimize, because profit margins are smaller than on consoles and the work volume is an order of magnitude larger because of so many different configurations.

- With DX12, game engines no longer depend on driver optimization; instead they require game developers to spend the time and optimize directly. But developers were already time-starved in the first scenario. Nor do they have/know detailed hardware specs, since video hardware is closed. No one knows very much about HOW to optimize, and optimizing on a model-by-model basis on PC is INSANE, because there aren't two models like on consoles but DOZENS of them, each with its own quirks and hardware bugs. This is a quote from a recent Anandtech article:
http://www.anandtech.com/show/10136/discussing-the-state-of-directx-12-with-microsoft-oxide-games
[...] is one major hurdle for new developers, particularly those who don’t have a firm grasp on the hardware. The low-level nature of DX12 means that more control over optimizations will be in the hands of developers – and they will need to rise up to the challenge for best results – as opposed to video card drivers.
How can they "rise up to the challenge" when right now the big problem is that PC ports are RUSHED? If DX12 requires MORE time and expertise, how can we expect it to be an improvement when *time* was the problematic variable in the first place?

And:
As Baker notes, since the PC is such a large and varied platform “You can never perfectly optimize for every platform because it's too much work” as compared to the highly regulated consoles, so instead the name of the game is making generic optimizations and try to be as even-handed as possible. At the same time the company has also been atypically transparent with its code, sharing it with all of the GPU vendors so that they can see what’s going on under the hood and give feedback as necessary.

Translation: "to the metal" mantra on PC is basically useless. Why? Because the hardware specs are hidden, because you will never have time to optimize on specific models, because you still have to wait on Nvidia to tell you what to do (assuming they are willing to and are in a good disposition). You try to write fast code in DX12, but it's not fast. At that point you either go through INFINITE trial & error, or you send code to Nvidia and hope they optimize it for you by looking at their locked in documentation and explain to you why something isn't working as it should. Same as when you sent them the code and wait for them to optimize the drivers, only that now there are exponentially MORE steps to go through and infinite back and forth since the guy that has the knowledge (Nividia engineers with secret hardware documentation) is not the guy who does the optimization (game dev who has no clue why something he just wrote for an optimization pass made everything slower). See Michael Abrash articles on optimization that explain why theoretically perfectly optimized code IN THEORY can perform much slower IN PRACTICE. And without direct hardware knowledge, that ONLY Nivida have, due to the fact they have the specs and wrote the drivers all these years, you just cannot write optimized code. You can't. The actual game dev has no clue on how to optimize. And he realistically won't have any clue for YEARS, even assuming wasting times for years painstakingly gathering that knowledge, while new models come out every year, creating an unending cycle of suck.

And that's why PC is so troubled to develop for and will never have good optimization, even less on the DX12 premise:
https://twitter.com/FioraAeterna/status/674020750253096960
true fact: hardware devs are sometimes nervous about helping you optimize arch N, because it means their arch N+1 looks worse by comparison
https://twitter.com/FioraAeterna/status/703294520369160196
considering how much some gamers freaked out at the "3.5GB" thing, revealing how broken GPUs are might make their heads actually explode
https://twitter.com/rygorous/status/653041674218672128
Everybody wants the level of control but nobody wants the responsibility that goes along with it. :)
if DX12/vulkan are magic bullets, their ability is to auto-aim at the programmer's foot
 

bee

Member
dx12 certainly runs better for me on a titan x at 4k: super stable framerates (37-45 for 30 mins of play), no sudden dips, no crashes, no stutter. wish vxao worked though

the benchmark doesn't seem indicative of actual in-game performance, in my experience
 
blah blah blah blah

So you've played one DX12 game, a game that has had support patched in and just seen its first release, where the developers openly say that for some people DX12 may run worse and that they're still working on it, and your reaction is:

"This is just the beginning of a disaster!"

The performance improvement the developers cite is for a specific hardware situation where the person has a relatively slower (note, not SLOW, slower) clocked multi-core CPU that is currently getting bottlenecked by a single core in *specific* areas of the game because of the DX draw calls. With DX12 that work gets distributed over multiple cores, and those specific areas no longer get bottlenecked by one core on the CPU.
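
To picture the mechanism, here's a rough sketch (mine, purely illustrative, nothing from the game's actual code): in D3D12, draw-call recording can be spread across worker threads and handed to the GPU in one submission, instead of every draw funneling through the driver on a single thread.

// Rough D3D12 sketch (illustrative only): each worker thread records its
// own command list; the main thread then submits them all in one call.
#include <d3d12.h>
#include <thread>
#include <vector>

void RecordChunk(ID3D12GraphicsCommandList* cl /* plus its slice of the scene */)
{
    // Record state changes and draw calls for this chunk on this thread;
    // under DX11 the equivalent work funnels through one driver thread.
    // cl->DrawIndexedInstanced(...);
    cl->Close();
}

void SubmitFrame(ID3D12CommandQueue* queue,
                 std::vector<ID3D12GraphicsCommandList*>& lists)
{
    std::vector<std::thread> workers;
    for (auto* cl : lists)                    // roughly one recorder per core
        workers.emplace_back(RecordChunk, cl);
    for (auto& t : workers)
        t.join();

    // One cheap submission for the whole batch.
    std::vector<ID3D12CommandList*> raw(lists.begin(), lists.end());
    queue->ExecuteCommandLists(static_cast<UINT>(raw.size()), raw.data());
}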

Lots of people were running into that issue in Geothermal Valley in the game. People here had even reported that they were getting CPU limited in such areas.

So sure, right now, DX12 is a lot less optimized and developers are just starting to get to grips with it, but for one group of people, an area of this game is already showing the benefits of it.

With DX12 you don't have to code closer to the metal if you don't want to, first of all. Secondly, the talented programmers on major engines will be able to leverage this in their engines, and multiple games will benefit. Any game using Unreal Engine, Unity, Frostbite, etc. will see major benefits.

And this is just the start of that. It's like the Vulkan (another closer-to-the-metal API) version of The Talos Principle: acknowledging that it doesn't run faster than DX11 yet, it's already handily beating the OpenGL version, and that demonstrates this is just a first step.

Or the TL;DR: you are obviously so eager to claim you were right in your prediction that closer-to-the-metal APIs will be a disaster that you have completely jumped the gun.
 

knerl

Member
If anything, expect your framerates to drop even lower.

For me, I get 60fps in that area using DX12 where DX11 didn't give me 60fps.
That's while using the same settings: PureHair, specular reflections on very high, the rest on high with HBAO+. 1080p.

EDIT: 4-10fps of increased performance at best, it seems. A really minor change. For now I'd rather stick with DX11 and the use of ReShade.
Hope it'll get better. On a GTX 970, 2500K, 16GB RAM here.
 

frontieruk

Member
Of course it has to do with the CPU. Where do you think many of those compute tasks were being done before?

On the GPU. Just not concurrently.

Remember that DirectX 12 is about making the API more efficient so it can take better advantage of multi-core CPUs. It's not really about graphics cards. It's about exploiting more performance from CPUs so they don't bottleneck the GPU.
 
Remember that DirectX 12 is about making the API more efficient so it can take better advantage of multi-core CPUs. It's not really about graphics cards. It's about exploiting more performance from CPUs so they don't bottleneck the GPU.

This is known. But asynchronous compute, which is what we were talking about, is about taking compute tasks on the GPU and running them concurrently with other tasks... instead of the serial execution they get under DX11.
 

tuxfool

Banned
Of course it has to do with the CPU. Where do you think many of those compute tasks were being done before?

On the GPU. Just not concurrently.

Yup. And to further expand on that: async compute doesn't make viable tasks that were previously unsuitable for a GPU to execute; it just allows one to maximise execution resources (plus low latency and task preemption on GCN).

As I said, it is analogous to hyperthreading, where idle resources (blocked by some other task) are enabled to execute a different task, and like hyperthreading, its utility will be highly dependent on the kind of tasks the GPU is executing.
 

Kezen

Banned
VXAO looks really good. Very expensive, but we're getting close to offline quality in occlusion here.
From ComputerBase: No AO / AO / VXAO [comparison screenshots]

It does look excellent indeed. On my 980, though, it stutters a lot in the areas I tested (Geothermal Valley) with it on, dropping to 40fps, and even triple buffering can't mask how ugly that is on a 60Hz panel.

Pascal can't come soon enough.
 

Teletraan1

Banned
This thread reads like some people's first DirectX rodeo. A new version of DirectX is usually a bit slower and occasionally buggy at first, then pulls ahead shortly after release as more devs get used to it and drivers mature. Plus, this is not a native DX12 game.

Good to see higher minimums and more steady frames being reported even if the averages aren't quite as high. Should improve a bit as everything matures.

I wasn't even aware that vendor-specific features didn't work out of the box on the MS Store versions of these games. The limitations of those versions are pretty extensive and frankly make them pretty unpalatable. Outside of MSGS exclusives and pricing errors, why would you trade away so much for little to no benefit to anyone outside of MS?
 

Buburibon

Member
VXAO looks really good. Very expensive, but we're getting close to offline quality in occlusion here.
From ComputerBase: No AO / AO / VXAO [comparison screenshots]

Oh yeah, it makes a pretty big difference actually. So much of a difference, in fact, that I've restarted the campaign and will play through the entire thing with VXAO now that it's available. The interesting thing about performance I've noticed so far is that early areas are now considerably more demanding with VXAO, but the previously most demanding section of the game, Geothermal Valley, is only slightly more demanding with the new AO. Does that even make any sense at all? I would've expected the GPU load with VXAO to go up proportionally in all areas.
 

Durante

Member
Oh yeah, it makes a pretty big difference actually. So much of a difference, in fact, that I've restarted the campaign and will play through the entire thing with VXAO now that it's available. The interesting thing about performance I've noticed so far is that early areas are now considerably more demanding with VXAO, but the previously most demanding section of the game, Geothermal Valley, is only slightly more demanding with the new AO. Does that even make any sense at all? I would've expected the GPU load with VXAO to go up proportionally in all areas.
It does make sense to some extent. Assume that VXAO is a rather fixed load regardless of the scene complexity. In that case, in complex scenes, the relative performance impact will be significantly lower than in simple scenes.
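
To put made-up numbers on it: if VXAO costs a fixed ~4 ms, a simple scene rendering in 10 ms (100 FPS) goes to 14 ms (~71 FPS, a ~29% drop), while a heavy scene at 20 ms (50 FPS) goes to 24 ms (~42 FPS, only a ~17% drop). Same absolute cost, very different relative impact.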
 

Maxey

Member
So can I have a quick rundown of how VXAO works? Is it still based on a depth buffer or does it actually know where objects are and how close they are to other things?
 

tuxfool

Banned

Pachinko

Member
I didn't feel like print screening my benchmark results but I'll echo the results above -

i7 6700K, 32GB system RAM @3200MHz, 8GB 390X. I haven't overclocked anything myself yet, although I'm told it's not too hard to accomplish with the CPU. Anyway -

DirectX 11 -
Mountain Peak - Average 57.38 / Low 27.87 / High 94.76
Syria - Average 58.42 / Low 8.22 / High 76.94
Geothermal Valley - Average 55.58 / Low 11.24 / High 61.16

DirectX 12 -
Mountain Peak - Average 54.91 / Low 38.11 / High 70.78
Syria - Average 53.83 / Low 41.39 / High 62.07
Geothermal Valley - Average 48.63 / Low 35.57 / High 63.52

Those minimum FPS differences are pretty astounding, but I noticed some oddball issues. Despite the huge improvement at the bottom, the DX11 test ran FAR better; the average FPS counter doesn't really lie. It looked like a locked 60 FPS for the mountain peak test, whereas in DX12 it falls to that lower number for much longer the second the camera turns to show the mountain top itself. The Syria test had that 8.22 fps in DX11 because of a loading hiccup almost immediately after starting up; otherwise it ran smooth as butter. Geothermal Valley looked almost the same in both tests visually. DX12 also has another knock against it: I get a strange bit of graphical corruption when the game is rendering the falling snow, with 2 sets of 5 transparent bars showing up on the screen during the first test. It's probably something a driver update will tweak, or maybe another patch later on, but I think I'll stick with DX11 for now.
 

Buburibon

Member
It does make sense to some extent. Assume that VXAO is a rather fixed load regardless of the scene complexity. In that case, in complex scenes, the relative performance impact will be significantly lower than in simple scenes.

I see, that definitely helps make sense of what I'm seeing. Thanks!
 