Support NeoGAF

moniker · Mar 21, 2016

dr_rus said:
Lots of cool stuff there as well:

Both the fixed foveated rendering and multi gpu performance increases seem kind of disappointing?

Dictator93 · Mar 21, 2016

pottuvoi said:
"Great Management of Technical Leads" (Mika Acton / Insomniac Games)

Optimizing the Graphics Pipeline with Compute (Graham Wihlidal / Dice)

Object Space Lighting following Film rendering 2 decades later in real time (Dan Baker / Oxide Games)

All wonderful presentations. Oxide game's render is pretty darn unique.

dr_rus · Mar 21, 2016

pottuvoi said:
Optimizing the Graphics Pipeline with Compute (Graham Wihlidal / Dice)

Rather disappointing gains from async compute again (2% PS4, 5% XBO, 12% FuryX if I'm not mistaken). But the gains from compute based culling are extreme on the other hand. GCN's geometry pipeline need to improve.

Dictator93 said:
All wonderful presentations. Oxide game's render is pretty darn unique.

They're definitely unique in shading all stuff in a 4Kx4K buffer even when a user is running in 1080p resolution. 512 bit GDDR5 and HBM cards will probably shine there 8)

Purple Cheeto · Mar 21, 2016

Thanks for the links. I hate how most of the stuff gets locked behind paywalls.

icecold1983 · Mar 21, 2016

dr_rus said:
Rather disappointing gains from async compute again (2% PS4, 5% XBO, 12% FuryX if I'm not mistaken). But the gains from compute based culling are extreme on the other hand. GCN's geometry pipeline need to improve.

They're definitely unique in shading all stuff in a 4Kx4K buffer even when a user is running in 1080p resolution. 512 bit GDDR5 and HBM cards will probably shine there 8)

its just 1 use case for async compute. its probably much easier to saturate the smaller amt of CUs on the consoles as oppose to the massive arays on higher end pc parts.

Kaako · Mar 21, 2016

Thanks guys for linking up the archive videos. Keep em coming! Great stuff.

Caayn · Mar 21, 2016

icecold1983 said:
30 to 35% increase by adding a 2nd gpu or am i interpreting it wrong?

Looks like it. I had hoped that VR would benefit more from multi-gpu set ups. 35% is disappointing.

tuxfool · Mar 21, 2016

Caayn said:
Looks like it. I had hoped that VR would benefit more from multi-gpu set ups. 35% is disappointing.

This is just one implementation, but it is known that SFR actually is less scalable than AFR.

What it does allow is decent frame pacing which is something that is highly desirable.

dr_rus · Mar 21, 2016

icecold1983 said:
its just 1 use case for async compute. its probably much easier to saturate the smaller amt of CUs on the consoles as oppose to the massive arays on higher end pc parts.

Yeah, well, it's one case but it's not the first which gives that ~+10% performance result on PC GCN cards.

It's also highly questionable if such compute culling optimization will actually help anything but GCN - it's pretty clear that consoles are the main target of these compute tricks which stem from the issues the GCN geometry pipeline has. I also kinda wonder if the primitive discard unit shown in GCN4 will perform something like this in h/w for Polaris.

icecold1983 · Mar 21, 2016

dr_rus said:
Yeah, well, it's one case but it's not the first which gives that ~+10% performance result on PC GCN cards.

It's also highly questionable if such compute culling optimization will actually help anything but GCN - it's pretty clear that consoles are the main target of these compute tricks which stem from the issues the GCN geometry pipeline has. I also kinda wonder if the primitive discard unit shown in GCN4 will perform something like this in h/w for Polaris.

most development optimizations are going to be for amd going forward. its going to be an uphill battle for nvidia. 10 to 15% here and there is nothing to scoff at. esp considering its just extra performance without any cost to the user. good developers have pretty much solved gcns geomoetry weakness. its not even an issue outside of nvidias over tessellated gameworks libraries.

dr_rus · Mar 21, 2016

https://www.youtube.com/watch?v=tVHH3-bP-fE
"THE GIFT" GDC2016 Trailer (created using "MARZA Movie Pipeline for Unity")

icecold1983 said:
most development optimizations are going to be for amd going forward. its going to be an uphill battle for nvidia.

This remains to be seen. There's only so much optimization you can perform on a fixed architecture before the gains become negligible. And when that happens you're starting to research other stuff on PC h/w. I'd say that by 2017 the industry will squeeze all the juices from console APUs and move on to the future h/w.

BriareosGAF · Mar 21, 2016

icecold1983 said:
most development optimizations are going to be for amd going forward. its going to be an uphill battle for nvidia. 10 to 15% here and there is nothing to scoff at. esp considering its just extra performance without any cost to the user. good developers have pretty much solved gcns geomoetry weakness. its not even an issue outside of nvidias over tessellated gameworks libraries.

Low utilization during depth-only passes (or from zero pixel draw due to poor occlusion/hiz) is definitely an issue independent of tessellation. The one launch per SE referred to in the paper is something folks deal with in varying ways (caching results of shadow draws, pre-culling on both CPU and GPU, etc.).

Also async workload doesn't really have anything to do with CU count per se, unless your workloads are so tiny they can't fill up available CUs. This isn't usually the issue, the issue is poor wavefront utilization due to high resource usage (usually VGPR) resulting in low SIMD occupancy. If you have low resource workloads on compute, especially if they're ALU heavy and don't steal memory bandwidth, you can find wins. But the better you get at reducing low utilization parts of your frame, the less opportunities you have to leverage async.

Kezen · Mar 21, 2016

I have to say I was expecting much more gains due to async compute bearing in mind the hub hub about this feature.

pottuvoi · Mar 22, 2016

Temporal Reprojection Anti-Aliasing in Inside (Lasse J F Pedersen / PLAYDEAD)

icecold1983 · Mar 22, 2016

http://goo.gl/2Gl6MZ

glacier engine slides

c0de · Mar 22, 2016

dr_rus said:
Rather disappointing gains from async compute again (2% PS4, 5% XBO, 12% FuryX if I'm not mistaken). But the gains from compute based culling are extreme on the other hand. GCN's geometry pipeline need to improve.

What I do find interesting is that the times for XBO and PS4 are so close together although this is data coming from the rather slow DDR3 RAM.

dr_rus · Mar 22, 2016

c0de said:
What I do find interesting is that the times for XBO and PS4 are so close together although this is data coming from the rather slow DDR3 RAM.

Console engines are optimized for low memory bandwidth obviously. An engine which is optimized for XBO's DDR3 limitations won't benefit from PS4's x2 bandwidth as it simply won't need it. This is where the lower common denominator kicks in and where PS4 exclusive titles have an opportunity to shine by comparison.

dr_rus · Mar 22, 2016

icecold1983 said:
http://goo.gl/2Gl6MZ

glacier engine slides

5-10% again.

BigTnaples · Mar 22, 2016

Was really looking forward to this one...

Global Illumination in 'Tom Clancy's The Division'
Nikolay Stefanov | Technical Lead, Ubisoft Massive
Location: Room 2016, West Hall
Date: Friday, March 18
Time: 1:30pm - 2:30pm
Format: Session
Track:
Programming,
Visual Arts
Pass Type: All Access Pass, Main Conference Pass - Get your pass now!

The session will describe the dynamic global illumination system that Ubisoft Massive created for "Tom Clancy's The Division". Our implementation is based on radiance transfer probes and allows real-time bounce lighting from completely dynamic light sources, both on consoles and PC. During production, the system gives our lighting artists instant feedback and makes quick iterations possible.
The talk will cover in-depth technical details of the system and how it integrates into our physically-based rendering pipeline. A number of solutions to common problems will be presented, such as how to handle probe bleeding in indoor areas. The session will also discuss performance and memory optimization for consoles.
Takeaway

Attendees will gain understanding of the rendering techniques behind precomputed radiance transfer. We will also share what production issues we encountered and how we solved them - for example, moving the offline calculations to the GPU and managing the precomputed data size.
Intended Audience

The session is aimed at intermediate to advanced graphics programmers and tech artists. It will also be of interest to lighting artists who are interested in improving their workflow. Knowledge of key rendering techniques such as deferred shading and 3D volume mapping will be required.

Hopefully someone here will post info or slides.

mckmas8808 · Mar 22, 2016

dr_rus said:
Console engines are optimized for low memory bandwidth obviously. An engine which is optimized for XBO's DDR3 limitations won't benefit from PS4's x2 bandwidth as it simply won't need it. This is where the lower common denominator kicks in and where PS4 exclusive titles have an opportunity to shine by comparison.

Interesting.

c0de · Mar 22, 2016

mckmas8808 said:
Interesting.

His opinion on this? Sure.

dr_rus · Mar 22, 2016

c0de said:
His opinion on this? Sure.

You have a different one?

Bl@de · Mar 22, 2016

Does anybody have a list of Youtube links to talks? Really enjoyed Carmacks VR talk and the Vulkan presentation last year. Slides alone are not that useful when it comes to presentation recap

Kralamoonard · Mar 22, 2016

c0de said:
His opinion on this? Sure.

Err do you have anything to add to that rather than saying it's just his opinion? No offence, but it just comes across as pretty snarky and pretentious that you're trying to reduce his claim to just an 'opinion' without offering any insight at all to why it maybe an opinion rather than a fact.

Not very productive to the conversation, no?

icecold1983 · Mar 23, 2016

dr_rus said:
5-10% again.

I never claimed it was going to bring huge increases in perf, altho i can see it bringing maybe 15 to 20 in the best use cases as developrs come to grips with it. Ands again, itsfree perf just by installing a new vers of windows and just extends the lead amd already has lately. Amd being ~30% faster under dx12 titles is pretty substantial

c0de · Mar 23, 2016

dr_rus said:
You have a different one?

"Console engines are optimized for low memory bandwidth obviously."
To me it is not obvious as they started their slides with the CU specs and not with memory specs. To me it is
a) not obvious that the engine is optimized for low bandwidth and
b) coming from the wrong hypothesis, this doesn't mean that first party would make better use of it.
Especially when you look at what PC is able to do. Wouldn't it also suffer from the XBO? Or do you think that they make a console version and a PC version?
The biggest difference as shown in their starting slides is GPU performance and then in the end results are shown. Drawing the conclusion that the difference isn't that big because they "gimped" the PS4 console engine version sounds strange and is not proved by the slides.
That's why I think it is not "obvious".

BriareosGAF · Mar 23, 2016

If there's one thing I've learned posting on GAF, it's that real technical knowledge is not particularly valued--these discussions are almost always proxies for validating some pre-existing thesis along a YCS axis. It is what it is.

dr_rus · Mar 23, 2016

c0de said:
"Console engines are optimized for low memory bandwidth obviously."
To me it is not obvious as they started their slides with the CU specs and not with memory specs. To me it is
a) not obvious that the engine is optimized for low bandwidth and
b) coming from the wrong hypothesis, this doesn't mean that first party would make better use of it.
Especially when you look at what PC is able to do. Wouldn't it also suffer from the XBO? Or do you think that they make a console version and a PC version?
The biggest difference as shown in their starting slides is GPU performance and then in the end results are shown. Drawing the conclusion that the difference isn't that big because they "gimped" the PS4 console engine version sounds strange and is not proved by the slides.
That's why I think it is not "obvious".

You think PC doesn't suffer from these optimizations? Take a look at how Fury X is doing against 980Ti. Optimizing for low bandwidth is the first thing you should do on a console h/w, CUs and all else comes after that or as a results of that - allowing you to load the h/w with math while the memory fetch is running.
When your shaders are balanced in such a way which allows them to execute on XBO without stalling from memory latency - any additional bandwidth is essentially wasted on them because they can't saturate it unless the GPU is proportionally faster in math. This may be somewhat true for PC h/w but for PS4 the GPU is just 30% more powerful while the bandwidth it has is 2.5 times more. This will lead to bandwidth being underused on PS4 in multiplatform titles. Which is actually illustrated quite well by most PS4 versions of multiplatform titles running in a higher resolution.

Dictator93 · Mar 23, 2016

Kezen said:
I have to say I was expecting much more gains due to async compute bearing in mind the hub hub about this feature.

I think if you are already pushing the GPU hard in a lot of places to produce a frame, it is hard to get a lot out of it as there is just too much "pressure".

pottuvoi said:
Temporal Reprojection Anti-Aliasing in Inside (Lasse J F Pedersen / PLAYDEAD)

This was also great. I like how he went into the sample pattern and even a way to deal with the resolve/disolve artifact problem.

Cool!

edit:
Flying WildHog's Krzysztof Narkowicz ‏put out a blog collating all the GDC papers he could find atm.

dr_rus · Mar 23, 2016

Dictator93 said:
edit:
Flying WildHog's Krzysztof Narkowicz ‏put out a blog collating all the GDC papers he could find atm.

Great find!

dragonelite · Mar 24, 2016

Dictator93 said:
I think if you are already pushing the GPU hard in a lot of places to produce a frame, it is hard to get a lot out of it as there is just too much "pressure".

Pretty sure the dev all profiled and look up where the gaps were, and optimized their rendering pipeline/process to minimize those gaps.

dr_rus · Mar 25, 2016

Dictator93 said:
edit:
Flying WildHog's Krzysztof Narkowicz ‏put out a blog collating all the GDC papers he could find atm.

This post is updated with:

“Math for Game Programmers: Building A Better Jump” – Kyle Pittman (Minor Key Games)
“An Excursion in Temporal Supersampling” – Marco Salvi (NVIDIA)
“Improving Geometry Culling for ‘Deus Ex: Mankind Divided'” – Nicolas Trudel (Eidos-Montréal)
“Fast, Flexible, Physically-Based Volumetric Light Scattering” – Nathan Hoobler (NVIDIA) - this one is NV's Volumetric Lightning actually which they've open sourced on GDC
“Building Paragon in Unreal Engine 4” – Benn Gallagher, Martin Mittring (Epic Games)

“Digital Humans: Crossing the Uncanny Valley in Unreal Engine 4” – (Epic Games)
“A Real-Time Rendered Future” – (Epic Games)
“Visual Effects Roundtable” – Drew Skillman (Google)

etc

Some slides from “Building Paragon in Unreal Engine 4” talk:

BriareosGAF · Mar 25, 2016

I'll never forget my shock when I first started working on a UE3 game many years ago and found they had no parallel command buffer generation, and only two SPU modules, both packed into the same elf, one of which was just EDGE culling, the other fragment program patching of course. We actually ran out of bits in a 64-bit flag mask on MW3 because we had so many modules (my port of some of the core physics modules pushing it over the limit). Good times.

More_Badass · Mar 25, 2016

The Rain World devs did a talk on procedural animation, but unfortunately I don't think it's available publically

Kezen · Mar 25, 2016

So Paragon is sub 1080p on PS4 ?

Kralamoonard · Mar 25, 2016

Kezen said:
So Paragon is sub 1080p on PS4 ?

Yep, but it's 60fps so no surprise tbh. It's following the Uncharted 4 MP and Star Wars Battlefront route.

Bl@de · Mar 25, 2016

The Kronos Group uploaded the Vulkan presentation to Youtube. Low quality but it's enough for a presentation.

Part 1

Part 2

dragonelite · Mar 25, 2016

Bl@de said:
The Kronos Group uploaded the Vulkan presentation to Youtube. Low quality but it's enough for a presentation.

Part 1

Part 2

Do the higher versions need some more processing time?

bj00rn_ · Mar 25, 2016

Blanquito said:
Fixed foveated rendering is just having the center of each eye higher quality, and doesn't change depending on where your eyes are actually looking, right?

Do you spend a majority of your time looking at the center of screen, and using your head to look around? If so, that seems like a decent solution to reduce resources required.

This is a really bad solution which will have negative impacts in experiences where you use your eyes to the extents like you do in real life - Unless it's somewhat synchronized with lens-flaws using resources which otherwise wouldn't be used.

icecold1983 · Mar 26, 2016

https://software.intel.com/sites/de...tions-and-DirectX-features-in-JC3_v0-92_X.pdf

just cause 3 dx12

dr_rus · Mar 26, 2016

icecold1983 said:
https://software.intel.com/sites/de...tions-and-DirectX-features-in-JC3_v0-92_X.pdf

just cause 3 dx12

I'm not sure that it's even about DX12 tbh. They mention DX12 only once describing ROV feature but the whole presentation is more about cutting stuff out for the game to work at 30 fps on Intel's iGPU. All features mentioned should be accessible via DX11.3 so this may be just a case of DX11 shader optimization.

Bl@de · Mar 26, 2016

dragonelite said:
Do the higher versions need some more processing time?

Yes and they are now available on the channel

MIMF · Mar 28, 2016

Anyone could download correctly the remedy paper from the OT? After downloading it I just end up with a broken file with all the sheets empty.

Dictator93 · Mar 28, 2016

Brian Karis already went over this kinda stuff on the UE4 stream/youtube channell, but seeing it again in a more methodical manner was nice.
Digital Humans: Crossing the Uncanny Valley in Unreal Engine 4 - GDC 2016

dr_rus · Mar 28, 2016

MIMF said:
Anyone could download correctly the remedy paper from the OT? After downloading it I just end up with a broken file with all the sheets empty.

Works fine here. Something must be interfering with the download on your connection.

MIMF · Mar 28, 2016

dr_rus said:
Works fine here. Something must be interfering with the download on your connection.

Thanks, I opened it with the PowerPoint online client from OneDrive and it finally opened well, for some reason my installed PowerPoint does not like the file.

Regarding the article/presentation, I was expecting much more information specially after all the recent news from the game rendering internals, pretty disappointing as it is just a DX12 small walkthrough.

icecold1983 · Mar 28, 2016

dx12 advancements

https://ky0vuq.bl3301.livefilestore...2F7bpT0yT0/DirectX 12 Advancements.pdf?psid=1

not specifically GDC related but interesting read. written by the lead graphics engineer at RAD

https://mynameismjp.wordpress.com/2016/03/25/bindless-texturing-for-deferred-rendering-and-decals/

dr_rus · Mar 28, 2016

A new PDF: DirectX 12 Advancements / Max McMullen, Direct3D Development Lead; Chas. Boyd, DirectX PM; Microsoft Silicon, Graphics and Media (SigMA)

Dammit... =)

MIMF said:
Thanks, I opened it with the PowerPoint online client from OneDrive and it finally opened well, for some reason my installed PowerPoint does not like the file.

Regarding the article/presentation, I was expecting much more information specially after all the recent news from the game rendering internals, pretty disappointing as it is just a DX12 small walkthrough.

Yeah, it's pretty basic.

dr_rus · Mar 30, 2016

https://developer.nvidia.com/vxao-voxel-ambient-occlusion

pottuvoi · Mar 30, 2016

Photogrammetry and Star Wars Battlefront.
Really great talk with nice tips on how to remove lighting from scanned textures.

Support NeoGAF

GDC 2016 papers and presentations

Member

Member

Member

Member

Member

Felium Defensor

Member

Banned

Member

Member

Member

Member

Banned

Banned

Member

Member

Member

Member

Todd Howard's Secret GAF Account

Mckmaster uses MasterCard to buy Slave drives

Member

Member

Member

Member

Member

Member

Member

Member

Member

Member

Member

Member

Member

Member

Banned

Member

Member

Member

Banned

Member

Member

Member

Member

Member

Member

Member

Member

Member

Member

Banned

Similar threads