
Nvidia and Remedy use neural networks for eerily good facial animation

dex3108

Member
Interesting use of technology.

Remedy, the developer behind the likes of Alan Wake and Quantum Break, has teamed up with GPU-maker Nvidia to streamline one of the more costly parts of modern games development: motion capture and animation. As showcased at Siggraph, by using a deep learning neural network—run on Nvidia's costly eight-GPU DGX-1 server, naturally—Remedy was able to feed in videos of actors performing lines, from which the network generated surprisingly sophisticated 3D facial animation. This, according to Remedy and Nvidia, removes the hours of "labour-intensive data conversion and touch-ups" that are typically associated with traditional motion capture animation.

Aside from cost, facial animation, even when motion captured, rarely reaches the same level of fidelity as other animation. That odd, lifeless look seen in even the biggest of blockbuster games is often down to the limits of facial animation. Nvidia and Remedy believe their neural network solution is capable of producing results as good as, if not better than, those produced by traditional techniques. It's even possible to skip the video altogether and feed the neural network a mere audio clip, from which it's able to produce an animation based on its prior training.
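At a high level, the pipeline described here maps per-frame inputs (video frames, or audio features in the audio-only mode) to facial animation parameters such as blendshape weights. The toy NumPy sketch below illustrates only that input/output shape; the feature sizes, parameter counts, and the single linear layer are all assumptions, not Remedy's actual architecture:

```python
import numpy as np

# Toy sketch of the inference step: per-frame input features
# (e.g. a face crop, or audio spectrogram slices) are mapped to
# facial animation parameters such as blendshape weights.
# The real system is a deep network; one linear layer stands in here.
rng = np.random.default_rng(0)

N_FEATURES = 64      # hypothetical per-frame feature size
N_BLENDSHAPES = 50   # hypothetical rig parameter count

W = rng.standard_normal((N_FEATURES, N_BLENDSHAPES)) * 0.1
b = np.zeros(N_BLENDSHAPES)

def animate(frames: np.ndarray) -> np.ndarray:
    """Map (T, N_FEATURES) input frames to (T, N_BLENDSHAPES) weights in [0, 1]."""
    raw = frames @ W + b
    return 1.0 / (1.0 + np.exp(-raw))   # squash into a valid blendshape range

clip = rng.standard_normal((30, N_FEATURES))  # one second of "footage" at 30 fps
weights = animate(clip)
print(weights.shape)  # (30, 50)
```

The point of the sketch is just the contract: a sequence of frames in, a sequence of rig parameters out, with the network learned from data rather than hand-animated.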

https://www.youtube.com/watch?v=VtttfrmfMZw

https://arstechnica.com/gaming/2017/08/nvidia-remedy-neural-network-facial-animation/?amp=1
 

Ushay

Member
Oh boy, that was incredible. Cannot wait to see what Remedy produce next; those guys have always been cinematic wizards, and it's sad how underrated their games are.
 

faridmon

Member
Wow, I was pretty meh on it until he moved around his eyebrows and forehead.

Mind blowing stuff

LA Noire looked amazing, why don't they use that technology?

That wasn't a very efficient use of performance capture. It was laborious, expensive and time consuming. On top of the facial performance capture, animators had to apply external and post-production effects, such as adding polygons and textures, to achieve those facial expressions.
 

vivekTO

Member
LA Noire looked amazing, why don't they use that technology?

Not saying this looks bad

Because that tech is not feasible to use alongside motion capture of the performers: they have to sit in front of an array of cameras to capture the facial video, which is then superimposed on the face mesh. Unless they figure out a way to capture the body performance as well, I don't think that tech can be widely used.
 
Holy shit this looks incredible!



Cannot wait to see what Remedy have cooking :)
 

nOoblet16

Member
LA Noire looked amazing, why don't they use that technology?

Not saying this looks bad
Because that tech was a dead end.
It wasn't really facial animation so much as a video slapped onto where the face of the character should be.

The limitations of that technique:
1) The ingame model looks exactly like the actor; you cannot use a different model
2) The actors had to sit completely still
3) Couldn't mo-cap the body and capture facial animations at the same time
4) They can't retarget the eyes of the model because it's not actually an eye, just part of the video, meaning there can be disparity in cutscenes
5) The faces can't really be lit properly
6) It also limits the quality of the shaders

Basically too many limitations, and it was a one-off case where it managed to work well. For a game with similar-style cutscenes, Mafia 3 has amazing facial animations.
 

laxu

Member
This looks great but doesn't it have the same issue as the LA Noire tech where the subject would have to emote without moving their head? Or could this be implemented in a motion capture setup so that actors can use their whole body? Or would this be something to use for a canned set of responses where animators would compose a cutscene or discussion from the responses and maybe body pose change animations?
 

peppers

Member
This plus their lighting technology will probably make Remedy's next game absolutely mind blowing in the graphics department.
 

Magypsy

Member
Because that tech was a dead end.
It wasn't really facial animation so much as a video slapped onto where the face of the character should be.

The limitations of that technique:
1) The ingame model looks exactly like the actor; you cannot use a different model
2) The actors had to sit completely still
3) Couldn't mo-cap the body and capture facial animations at the same time
4) They can't retarget the eyes of the model because it's not actually an eye, just part of the video, meaning there can be disparity in cutscenes
5) The faces can't really be lit properly
6) It also limits the quality of the shaders

Basically too many limitations, and it was a one-off case where it managed to work well. For a game with similar-style cutscenes, Mafia 3 has amazing facial animations.

7) Lots of cameras needed; very expensive.

This new neural network tech can be done veeerry cheaply. Hell, train it more and you could feasibly use video of any shape, size, and angle.
 

Maxey

Member
LA Noire looked amazing, why don't they use that technology?

Not saying this looks bad
Yeah, the lack of depth inside the mouths of the characters sure looked amazing.

It looked good at points but it was a very limited technology.

The new techniques are much better.
 

SomTervo

Member
LA Noire looked amazing, why don't they use that technology?

Not saying this looks bad

As well as all the already-mentioned reasons, I'm pretty sure the LA Noire method was also vastly inefficient in terms of space, which is partly why the game's install size was so massive and it shipped across 3 DVDs. Could be wrong about that though.
 

tuxfool

Banned
7) Lots of cameras needed; very expensive.

This new neural network tech can be done veeerry cheaply. Hell, train it more and you could feasibly use video of any shape size and angle.

You still need cameras to capture the performance; this method only works for facial capture. In terms of facial capture (with today's methods), you're basically talking about reducing from two cameras to one.

This neural net doesn't change the practical setup needed for performance capture, only the procedural ones such as data conversion.
 

EvB

Member
7) Lots of cameras needed; very expensive.

This new neural network tech can be done veeerry cheaply. Hell, train it more and you could feasibly use video of any shape size and angle.

Yep, that's the cool thing. It reduces cost and time in cleanup
 

Magypsy

Member
You still need cameras to capture the performance, this method only works for facial capture. In terms of facial capture (on today's methods) you're basically talking about reducing from 2 cameras to one.

This neural net doesn't change the practical setup needed for performance capture, only the procedural ones such as data conversion.

You're right about standard facial capture, but I was talking about LA Noire's facial capture which required a whole array of cameras.

 

EvB

Member
Yeah, I'm a bit confused here. This is pretty good, but it's not leagues ahead of the competition: https://www.youtube.com/watch?v=IA4bmiXNMoo
It's not about it being better or worse than that other super expensive method of motion capture. Of course you are going to get good results if you are walking around an entire state-of-the-art mo-cap facility.
It's about the high-quality results that you can achieve super easily.
People complained about Mass Effect's facial animation. Well, this means there is nothing to stop a developer simply recording the actors as they read their lines and getting more than usable lip-sync info.


This shows the footage they captured from (the input frame).
Target shows the results using an expensive multi-camera setup / conventional mo-cap setup that big productions use.
Output shows the neural network's attempt using just a single standard video feed with no 3D data.

Basically mo-cap from a smartphone would be possible.

This image shows another 3 attempts by different teams to achieve the same thing. As you can see, they don't even come close to the gold-standard method with all the expensive kit, or to the neural network method from a single camera feed.
 

tuxfool

Banned
You're right about standard facial capture, but I was talking about LA Noire's facial capture which required a whole array of cameras.

Oh, right. As others have noted that process was at a dead end, and it wasn't even about the requirement of multiple cameras.
 

tuxfool

Banned
Watch both videos again, keep in mind what the actors are having to deal with equipment wise.

In performance capture you're still going to need a head camera attached to a helmet.

They also presented audio driven animation tech.
Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion

Should be great for games with a huge number of NPCs.

I do question the true practicality of this method for games. The real trick in applying mocap to games is the ability to use low poly and low bone/blendshape count models and still have it look right. There is also the question that raw animation data like this is often in the hundreds of megabytes in size, which is really impractical for use in games.

The methods here may deal with it, but they don't touch upon these considerations.
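The storage concern raised above can be sanity-checked with back-of-envelope arithmetic. All figures below are assumptions for illustration, not numbers from the article:

```python
# Back-of-envelope check of the storage argument: raw per-vertex
# animation stores every vertex position every frame, while a rigged
# approach stores only blendshape weights. All sizes are assumed.
VERTICES = 5000          # face mesh vertex count (assumption)
FLOATS_PER_VERTEX = 3    # x, y, z
BYTES_PER_FLOAT = 4
FPS = 30
SECONDS = 600            # ten minutes of cutscenes

raw_bytes = VERTICES * FLOATS_PER_VERTEX * BYTES_PER_FLOAT * FPS * SECONDS

BLENDSHAPES = 50         # rig parameter count (assumption)
rig_bytes = BLENDSHAPES * BYTES_PER_FLOAT * FPS * SECONDS

print(f"raw: {raw_bytes / 1e6:.0f} MB")   # 1080 MB
print(f"rig: {rig_bytes / 1e6:.1f} MB")   # 3.6 MB
```

Even with modest assumed mesh sizes, raw vertex animation lands in the hundreds-of-megabytes-to-gigabyte range, which is why games retarget captures onto compact rigs rather than shipping raw data.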
 

vivekTO

Member
maybe a way to get the less talented and funded studios to produce animation work on par with titles like Uncharted 4

Ninja Theory is aiming for the same with Hellblade: less expensive, better results with their face tech, and I think that looks beautiful.
 
Wonder when we're going to see Remedy's Max Payne successor, I mean, "Project 7". I'm hyped after that "shoot dodge" reference Sam Lake gave when talking about it.
 

Dryk

Member
This sort of technology has the potential to save thousands of hours of labour per game. It really is incredible.
 
It has far more to do with budget and time than talent.

Plenty of studios have large budgets, lots of time, and far larger staffs, and can't hold a candle to Naughty Dog's work. *cough* Ubisoft *cough*

If you think all developers would produce Naughty Dog-level work given the same budget/time, you are out of your mind.
 

Popeck

Member
Next step: apply neural networks to design THE PERFECT GAME. Dem neural networks, is there anything they cannot do?
 

nOoblet16

Member
That's just the standard mo-cap tech.

Yeah, I'm a bit confused here. This is pretty good, but it's not leagues ahead of the competition: https://www.youtube.com/watch?v=IA4bmiXNMoo
Watch the video again.
Do you see the actor wearing any dots on his face for capture purposes? That should tell you it's not standard mo-cap.

Standard mocap just uses reference points from an actor's face and modifies the model accordingly, but at the end of the day the animation is still incomplete. There's only so much data you can gather from standard mo-cap with 20-30 reference points. You still have to add in the additional detail by hand to make the expression really work.

This provides a much faster way by reducing the amount of hand work required. Lastly, it's doing all of that without using any dots as reference points. That's what's truly impressive about it.
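What "using reference points to modify the model" typically amounts to can be sketched as a least-squares fit: each frame, solve for the blendshape weights that best reproduce the tracked marker positions. The sizes and data below are made up purely for illustration:

```python
import numpy as np

# Sketch of marker-based facial capture: ~30 tracked dots on the
# actor's face drive a rig by solving for blendshape weights that
# best reproduce the observed marker offsets (linear least squares).
rng = np.random.default_rng(1)

N_MARKERS = 30       # tracked dots (assumption)
N_BLENDSHAPES = 20   # rig controls (assumption)

# Each column: how all marker coordinates move when one blendshape fires.
basis = rng.standard_normal((N_MARKERS * 3, N_BLENDSHAPES))

true_w = rng.uniform(0, 1, N_BLENDSHAPES)   # the hidden "performance"
markers = basis @ true_w                    # observed marker offsets

# Recover the rig weights from the markers alone.
w, *_ = np.linalg.lstsq(basis, markers, rcond=None)
print(np.allclose(w, true_w))  # True
```

With only 20-30 markers the solve is well-posed but coarse, which matches the point above: fine detail between the dots still has to be added by hand.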


It's not meant to deliver results more realistic than what we have, it's meant to deliver equally realistic results at a faster rate.
 

cm osi

Member
The technology is amazing, but the closer we get to photorealism, the more I shift toward fantasy.
 

nOoblet16

Member
They've already done one better. How about no video at all?

It's like the phoneme-driven animation Valve used for HL2 but on steroids.

The main trick with this entire process is that it lives and dies based on the initial samples provided to the neural net. The article mentions that it needs about 10 minutes of really high quality, hand tuned 3D facial mocap first that it can learn from.
Which is not a problem, since once it's been trained you don't need to do it again. And the beauty of it is that it can improve itself on its own even after it's been trained.
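The training setup described (a small amount of high-quality, hand-tuned mocap as ground truth) is ordinary supervised learning. A minimal sketch, with a linear model and plain gradient descent standing in for the deep network, and every number below an assumption:

```python
import numpy as np

# Minimal sketch of "train on ~10 minutes of hand-tuned mocap":
# fit a model so model(features) ≈ artist-quality animation targets.
rng = np.random.default_rng(2)

T = 18000                # ~10 minutes at 30 fps
N_FEAT, N_OUT = 16, 8    # toy feature / rig-parameter sizes

X = rng.standard_normal((T, N_FEAT))    # input features per frame
W_true = rng.standard_normal((N_FEAT, N_OUT))
Y = X @ W_true                          # the "hand-tuned" targets

W = np.zeros((N_FEAT, N_OUT))
lr = 0.1
for _ in range(200):                    # gradient descent on mean squared error
    grad = X.T @ (X @ W - Y) / T
    W -= lr * grad

err = float(np.mean((X @ W - Y) ** 2))
print(err < 1e-3)  # True: the model has learned the mapping
```

Once trained, the model generalises to new footage of the same kind, which is why the expensive hand-tuned capture only has to be produced once.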
 
Because that tech was a dead end.
It wasn't really facial animation so much as a video slapped onto where the face of the character should be.

The limitations of that technique:
1) The ingame model looks exactly like the actor; you cannot use a different model
2) The actors had to sit completely still
3) Couldn't mo-cap the body and capture facial animations at the same time
4) They can't retarget the eyes of the model because it's not actually an eye, just part of the video, meaning there can be disparity in cutscenes
5) The faces can't really be lit properly
6) It also limits the quality of the shaders

Basically too many limitations, and it was a one-off case where it managed to work well. For a game with similar-style cutscenes, Mafia 3 has amazing facial animations.

What about the animations being stuck at 30fps?
 
This is incredible. As someone who had to do minor rigging work in animation at one point (just in school; I switched majors since) and who has also witnessed how much work goes into mocap, I can really appreciate how big this has the potential to be. Thanks, everyone, for sharing the additional info, pictures, etc. that help put it into even better perspective.
 