Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

March 20, 2016

Face2Face – Facial Mimicry In Real-Time Video

Filed under: Verification, Video — Patrick Durusau @ 1:42 pm

Is a video enough for you to attribute quotes to a public figure?

After reading “This system instantly edits videos to make it look like you’re saying something you’re not” by Greg Kumparak, you may not be so sure.

From the post:


The video up top shows a work-in-progress system called Face2Face (research paper here) being built by researchers at Stanford, the Max Planck Institute and the University of Erlangen-Nuremberg.

The short version: take a YouTube video of someone speaking like, say, George W. Bush. Use a standard RGB webcam to capture a video of someone else emoting and saying something entirely different. Throw both videos into the Face2Face system and, bam, you’ve now got a relatively believable video of George W. Bush’s face — now almost entirely synthesized — doing whatever the actor in the second video wanted the target’s face to do. It even tries to work out what the interior of their mouth should look like as they’re speaking.

Face2Face: Real-time Face Capture and Reenactment of RGB Videos by Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, Matthias Nießner, offers the following abstract:

We present a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., Youtube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. To this end, we first address the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling. At run time, we track facial expressions of both source and target video using a dense photometric consistency measure. Reenactment is then achieved by fast and efficient deformation transfer between source and target. The mouth interior that best matches the re-targeted expression is retrieved from the target sequence and warped to produce an accurate fit. Finally, we convincingly re-render the synthesized target face on top of the corresponding video stream such that it seamlessly blends with the real-world illumination. We demonstrate our method in a live setup, where Youtube videos are reenacted in real time.
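The abstract compresses the whole method into one paragraph, so here is a minimal Python sketch of the five stages in the order the abstract lists them. Every name, data structure, and coefficient count below is a hypothetical placeholder for orientation only, not the authors’ code (which, as far as I can tell, is not public).

```python
# Hypothetical sketch of the Face2Face pipeline stages as the abstract
# describes them; none of these names come from the authors' code.
from dataclasses import dataclass

import numpy as np


@dataclass
class FaceParams:
    """Parametric face state: identity is fixed per person, the rest varies per frame."""
    identity: np.ndarray     # shape coefficients, recovered once by bundling
    expression: np.ndarray   # per-frame expression coefficients
    pose: np.ndarray         # rigid head pose (rotation + translation)


def recover_identity(target_frames: list[np.ndarray]) -> np.ndarray:
    """Stage 1 (offline): non-rigid model-based bundling over the target
    video to pin down the under-constrained identity coefficients.
    Placeholder: a real solver jointly optimizes identity, expression,
    and pose across keyframes."""
    return np.zeros(80)  # shape coefficient count is illustrative


def track_expression(frame: np.ndarray, identity: np.ndarray) -> FaceParams:
    """Stage 2 (per frame, for both source and target): fit expression and
    pose by minimizing a dense photometric consistency measure, i.e.
    per-pixel color differences between the rendered model and the frame."""
    return np.zeros(0) or FaceParams(identity, np.zeros(76), np.zeros(6))


def transfer_deformation(source: FaceParams, target: FaceParams) -> FaceParams:
    """Stage 3: deformation transfer. Apply the source actor's expression
    to the target's identity while keeping the target's head pose."""
    return FaceParams(target.identity, source.expression, target.pose)


def retrieve_mouth_interior(reenacted: FaceParams,
                            target_frames: list[np.ndarray]) -> np.ndarray:
    """Stage 4: retrieve the target-video mouth interior that best matches
    the re-targeted expression, then warp it into place. Placeholder:
    returns an arbitrary frame; a real system does something like
    nearest-neighbor retrieval in expression space plus warping."""
    return target_frames[0]


def composite(target_frame: np.ndarray, reenacted: FaceParams,
              mouth: np.ndarray) -> np.ndarray:
    """Stage 5: re-render the synthesized face over the target frame so it
    blends with the frame's real-world illumination. Placeholder only."""
    return target_frame


def reenact(source_frame, target_frame, target_frames, src_id, tgt_id):
    """One iteration of the live loop: the webcam source drives the target video."""
    src = track_expression(source_frame, src_id)
    tgt = track_expression(target_frame, tgt_id)
    reenacted = transfer_deformation(src, tgt)
    mouth = retrieve_mouth_interior(reenacted, target_frames)
    return composite(target_frame, reenacted, mouth)
```

Note the asymmetry the abstract points out: identity recovery is the under-constrained, offline step (bundling over many frames), while tracking, transfer, mouth retrieval, and compositing all run per frame, which is what makes the live demo possible.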

The video is most impressive:

If you want to dig deeper, consider the 2015 paper Real-time Expression Transfer for Facial Reenactment (PDF), by Justus Thies, Michael Zollhöfer, Matthias Nießner, Levi Valgaerts, Marc Stamminger, Christian Theobalt.

With its own, equally impressive video:

The facial mimicry isn’t perfect by any means, but it is remarkably good.

Not a prediction, but full-body mimicry within five years would not surprise me.

The surprise will be the first non-consenting subject of full-body mimicry.

What would you want to see Donald (short-fingers) Trump doing with a pumpkin?

PS: Apologies, I wasn’t able to locate a PDF of the 2016 paper.
