plenty of studios have large budgets, lots of time, and far larger staffs, and still can't hold a candle to Naughty Dog's work. cough Ubisoft cough
if you think all developers would produce Naughty Dog-level work given the same budget/time, you are out of your mind
This is fantastic to see. Yes, L.A. Noire was great, but the amount of work and the limitations were insane; it clearly wasn't feasible tech.
Specific game example? Please don't say an open world game lol.
oh, the typical "but it's open world" crutch
fine, replace Naughty Dog with Zero Dawn. Not that it matters much for the facial animation of the select few main protagonists anyway.
Zero Dawn's facial animations were fine, but nothing compared to UC4.
I think you fail to realize that different developers have different goals. If someone is making an open-world game without a heavy focus on its characters and narrative, the facial animation isn't going to look as good as in something like Uncharted, where the focus is clearly the characters.
Then add in the extra time that first-party devs get (not to mention only having to work with one platform) compared to multiplatform games that get just one or two years of dev time for multiple versions of a game (Horizon had a six-year development cycle), and we come to the conclusion that "it just boils down to talent and nothing else" is the wrong idea.
That's not to say devs like Naughty Dog and Guerrilla aren't talented; they obviously are. But give them the same limitations and time constraints that other devs get and you'll see that their work will suffer as well.
As someone who knows nothing about programming, the fact that they train programs to improve by themselves is so alien and so cool to me.

Which is not a problem, since once it's been trained you don't need to do it again. And the beauty of it is that it can improve itself on its own even after it's been trained.
Are you seriously insinuating that there won't inherently be challenges in getting top-tier facial quality out of a third-party multiplatform open-world game, compared to a really linear first-party game? Zero Dawn doesn't even have relatively good facial animation, as the majority of its scenes suffer from dead eyes due to being procedurally generated (not to mention the subpar camera work for this day and age).
It's straight up a time and budget issue for the most part, and then there's stuff like asset quality. Unity spent the longest in development of any AC game, with the largest budget, and they pushed for this level of asset quality. Hell, there's straight-up crossover between the studios in terms of employees if you look at animator reels, so saying "it's all talent" is pretty inaccurate; pretty much every large studio working on a triple-A game is hiring the most talented animators in the industry, so it really comes down to how that team is used and how much time and money they're given. To drive the point home: one of the animators who worked on the facial animation of this scene also did the facial animation for this. Why do you think the facial animation looks a decent margin worse in Lost Legacy than it does in UC4?
Seems like they have to stand extremely still in those videos. ND, if I'm not mistaken, uses the same mocap take for both the actor's movement and speech (the same shot). That seems far superior to me, since you capture the whole actor in the moment and don't have to rely on combining two different takes.
Much of Naughty Dog's facial animation is done by hand, though. The idea behind this technology is that it makes that laborious process significantly easier if not unnecessary. Even if the actors had to do mocap for a given cutscene and then work through the script again for facial animation, that'd still be preferable to animators being tasked with the heavy lifting.
Doesn't Horizon Zero Dawn perfectly show the crux of the open-world issue? Compare the facial animations from Horizon with Killzone: Shadow Fall. Same developer, but Shadow Fall has much more accurate facial animations because they obviously had so much less to do.
The point I was trying to make was that having a single shot is better than combining two, not from a technical perspective but for the acting itself. Capturing the person in the moment, with how they use their entire body (head movement, arms, legs, facial expression), combined with the talent of the manual animators, will, I believe, provide a far better result than combining the body movement with some facial tech where the actor, from what it seems, can't even move their head, which could produce some really uncanny results IMO.
Much of Naughty Dog's facial animation is done by hand, though.
Y'all never gonna touch Medal Of Honor!
This is no longer true.
I know ND used facial capture this time around, but it's my understanding that there was still a fair amount of manual work involved in achieving the final result.
This is true for any performance capture pipeline. It still takes a lot of work to get performance capture looking good.
Baby steps... The target's ears don't move. That bugs me now.
Apart from the lack of eye contact, that ain't bad at all. Reminds me a bit of how good RE7's faces looked.
Shame about the hair :\
They need to be clearer about what they're using the neural networks for.
What are they trying to predict, and is it regression or classification?
It's certainly not transforming words and sounds into facial animation, because it's copying his nonverbal facial movements. So what's the NN used for, and why is it helpful when we already have good motion-capture tech?
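To make the regression-vs-classification question concrete, here is a toy sketch of the regression framing: a network (reduced here to a single linear layer, my simplification, with all sizes made up) that outputs continuous rig-control values for each frame rather than a class label.

```python
import numpy as np

# Toy sketch: facial animation from video is naturally a *regression*
# problem: continuous rig-control values per frame, not a class label.
# The single linear layer is a stand-in for the whole network, and all
# sizes are made up.

rng = np.random.default_rng(0)

N_PIXELS = 320 * 240   # flattened grayscale frame
N_CONTROLS = 40        # hypothetical number of facial rig controls

W = rng.normal(scale=1e-4, size=(N_CONTROLS, N_PIXELS))
b = np.zeros(N_CONTROLS)

def predict_rig_controls(frame: np.ndarray) -> np.ndarray:
    """Map one flattened frame to continuous rig-control values."""
    return W @ frame + b

frame = rng.random(N_PIXELS)  # stand-in for one preprocessed video frame
controls = predict_rig_controls(frame)
assert controls.shape == (N_CONTROLS,)  # one continuous value per control
```

A classifier, by contrast, would collapse the face into a handful of discrete expression labels, which is far too coarse for animation.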
Right. Which circles this discussion back to what I said before about the current iteration of the technology being aimed at streamlining the process of creating nuanced facial animation.
ND no longer solely keyframes facial animation for cutscenes. Their models are now too complicated for that to be feasible. Actually, the worst-looking cutscenes in UC4 are those that were made before they started using facial capture.
Medal of Honor is a straight-up case study in the uncanny valley. They missed the mark by an absurd degree.
That's pretty much every studio ever, though, since the inception of performance capture in the industry. You can't just drop facial capture data (or any mocap data, really) on a model and call it a day. Another note is that, judging by all the animation reels

He said "much of the facial animations in Uncharted 4," not all of the facial animations. Basically, even though they use facial capture, they still do a lot of handwork on it later. The game's pseudo-realistic art style sort of demands that the facial animations be exaggerated as well, and hence, even though they have the mocap reference, they do a lot of fine-tuning, probably more than other games, in order to achieve their look.
I've already acknowledged that all other games do it and that's how it works. What I meant to say is that the balance between capture and hand animation probably skews more toward animating by hand for Uncharted 4 than for other games. Which would mean their amazing facial animation is primarily because of the artists rather than their capture technology.
Yeah, I would still argue that really isn't true.
it really isn't.
Yeah, it definitely should be a goal. However, as I speculated earlier (aside from the raw nature of this technique), you're still going to have to reduce animations to fit simpler facial models with fewer bones. You're almost certainly also going to have to compress animation data.
For the foreseeable future that is still going to require a lot of manual labour to get the best results.
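As a toy illustration of the kind of compression that speculation points at, here is a simple 16-bit quantization of a float animation curve. The exact scheme any engine uses is an assumption on my part, not something from the thread.

```python
import numpy as np

# Toy sketch of animation-data compression: quantize a float64 curve
# (e.g. one bone's rotation channel over time) to 16-bit integers plus a
# per-curve min/max, trading a tiny error for a 4x size reduction.

def quantize(curve: np.ndarray) -> tuple[np.ndarray, float, float]:
    lo, hi = float(curve.min()), float(curve.max())
    q = np.round((curve - lo) / (hi - lo) * 65535).astype(np.uint16)
    return q, lo, hi

def dequantize(q: np.ndarray, lo: float, hi: float) -> np.ndarray:
    return q.astype(np.float64) / 65535 * (hi - lo) + lo

curve = np.sin(np.linspace(0, 10, 1000))  # stand-in for one animation channel
q, lo, hi = quantize(curve)
restored = dequantize(q, lo, hi)

assert q.nbytes == curve.nbytes // 4            # 16-bit ints vs 64-bit floats
assert np.max(np.abs(restored - curve)) < 1e-4  # tiny quantization error
```

Real engines also drop keyframes and retarget to fewer bones, which is where the quality reduction mentioned above comes from.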
Sort of, I was curious and found the paper that this article is about. It's really cool, and neural nets are going to play a major role in this type of automation in the future, but this early implementation turns out to have some big limitations at the moment (basically, all the same limitations as L.A. Noire's facial capture):
1. Every single user needs to be calibrated independently for the system to work.
(That 5-10 minute dataset is the labor-intensive part, and obviously takes far more than 5-10 minutes to produce.)
Similarly, while I'm sure later implementations of neural networks for this type of animation automation will have self-improvement, this one doesn't seem to. Not only does it lack self-improvement, but part of the hand-tweaked training footage used by the system outlined in this paper specifically calls for performance of the actor in character. While that helps with error correction for that singular character's performance, it's not robust enough yet to even error correct a single user with a fully neutral training dataset.
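A minimal sketch of that per-actor calibration constraint, with a plain least-squares fit standing in for the actual network's training (all sizes are made up):

```python
import numpy as np

# Sketch of per-actor calibration: fit a mapping from one actor's
# footage to hand-animated targets. A least-squares solve stands in for
# the real network's training; all sizes here are invented.

rng = np.random.default_rng(1)

N_FRAMES = 200    # stand-in for the 5-10 minutes of calibration footage
N_PIXELS = 64     # tiny "frames" so the example runs instantly
N_CONTROLS = 8

frames = rng.random((N_FRAMES, N_PIXELS))     # this actor, this lighting
targets = rng.random((N_FRAMES, N_CONTROLS))  # artist-tweaked animation

# Per-actor "training": the solved weights are only valid for this actor.
W, *_ = np.linalg.lstsq(frames, targets, rcond=None)

# A new frame of the SAME actor under the SAME setup can be animated...
controls = rng.random(N_PIXELS) @ W
# ...but a new actor (or new lighting) means redoing calibration from scratch.
assert W.shape == (N_PIXELS, N_CONTROLS)
```

The point is that nothing learned for one actor transfers: the fitted weights bake in that specific face and setup.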
2. It turns out that while their intention is to eventually get this system up and running based on head-mounted face cams that are typically used for simultaneous facial and motion capture (like the one Ashley Johnson is wearing with the face markers in the Naughty Dog video linked above in this thread), their current implementation doesn't work with that. Instead all subsequent input footage provided to the neural net after it's finished learning from the training dataset is still coming from the same controlled environment and lighting setup used in training.
Again, I'm sure they'll get there, but it's not a trivial task that has already been handled. In fact, I'm guessing that they're going to need to rework the way they feed and tell the neural net how to interpret the arbitrary input footage, as right now all input footage is being converted to grayscale and cropped to 320x240, so the neural net looks at each frame as a set of 76.8k scalar values (I'm assuming 0-255 range). Introducing a moving background and variable lighting as an actor shifts and moves in a scene doing simultaneous body and facial motion capture would disrupt a lot of what their current neural net is assuming it will receive from the incoming footage.
It's basically using the fact that the actor, background, and lighting setup are all constants to allow the system to look at each pixel as an advanced face tracking dot, but if any of those aspects are changed, it's like wiping the face tracking dots off an actor's face, or moving them to incorrect locations and screws with the automation.
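That input format (grayscale, cropped to 320x240, one scalar per pixel) can be sketched like this; the crop window position and source resolution are assumptions for illustration:

```python
import numpy as np

# Sketch of the described input pipeline: each frame becomes a 320x240
# grayscale image, i.e. a flat vector of 76,800 scalars in 0-255.
# The crop coordinates and 480x640 source size are made up.

def preprocess(frame_rgb: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 frame to the flat grayscale vector the net sees."""
    # Standard luminance weights for RGB -> grayscale.
    gray = (frame_rgb @ np.array([0.299, 0.587, 0.114])).astype(np.uint8)
    # Crop a fixed 240-row x 320-column window around the face
    # (fixed coordinates only work because the actor never moves).
    top, left = 100, 160
    crop = gray[top:top + 240, left:left + 320]
    return crop.reshape(-1)  # 76,800 scalar values

frame = np.random.default_rng(2).integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
pixels = preprocess(frame)
assert pixels.shape == (76800,)
```

The hard-coded crop is exactly the fragility described above: move the actor or the camera and every pixel index stops meaning what the net learned it means.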
So the system, as it exists right now, has almost all of the same constraints as the L.A. Noire facial capture system, requiring that the actor, environment, and lighting all be identical, but it produces higher-quality 3D animations and doesn't require as elaborate a camera rig as L.A. Noire used. It'll get better, but it's not there yet, and this implementation certainly isn't learning and improving its error correction over time beyond the initial training phase. Neural nets are definitely the future, though, and I stumbled across another video of a character animation system that was using a neural net to achieve real-time results for the equivalent of IK-blended animation, which looked much more natural than typical IK animation corrections in games.
Probably the coolest part I learned from the article is that this system doesn't use any temporal smoothing, but instead processes each frame completely independently of all others, and yet its results are so high quality that it appears perfectly smooth when played as animation. That's quite a feat in its own right and something hand-tweaking never achieves on its own.
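That no-temporal-smoothing property is worth sketching: each frame maps to animation values statelessly, so processing order can't affect the result. The toy mapping below is made up; the EMA filter is the kind of smoothing pass the paper's system skips.

```python
import numpy as np

# Each frame is processed with no state carried between frames, so the
# processing order cannot affect the result. The toy mapping is made up;
# the EMA filter below is the smoothing pass this pipeline does NOT need.

rng = np.random.default_rng(3)

def animate_frame(frame: np.ndarray) -> np.ndarray:
    """Stateless stand-in for the network: frame in, rig controls out."""
    return frame[:8] * 0.5  # no history, no hidden state

frames = [rng.random(16) for _ in range(5)]

# Independent per-frame processing: forward and backward runs agree exactly.
forward = [animate_frame(f) for f in frames]
backward = [animate_frame(f) for f in reversed(frames)]
assert all(np.allclose(a, b) for a, b in zip(forward, reversed(backward)))

# For contrast, a stateful exponential-moving-average smoother:
def ema(seq, alpha=0.3):
    out, state = [], seq[0]
    for c in seq:
        state = alpha * c + (1 - alpha) * state
        out.append(state)
    return out

smoothed = ema(forward)
assert len(smoothed) == len(forward)
```

Getting smooth output without any such filter means the per-frame predictions themselves are remarkably consistent, which is the feat being praised above.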
"Okay the wireframe looks nice but why do they have the recorded actor in a frame and not the final work?
...
OH GOD THAT IS THE FINAL WORK ISN'T IT?"
Ehhh... no, it's not. It's the input video for the neural network.
... wait, when people say it looks amazing, they couldn't possibly be thinking the first frame is the generated CG, right?
The audio-driven one is pretty cool as well.