Fully Controllable Data Generation For Realtime Face Capture

dc.contributor.author: Chandran, Prashanth
dc.date.accessioned: 2023-12-06T07:43:08Z
dc.date.available: 2023-12-06T07:43:08Z
dc.date.issued: 2023-01-31
dc.description.abstract: Data-driven realtime face capture has gained considerable momentum in the last few years, thanks to deep neural networks that leverage specialized datasets to speed up the acquisition of face geometry and appearance. However, generalizing such neural solutions to generic in-the-wild face capture remains a challenge due to the lack of, or a means to generate, a high-quality in-the-wild face database with all forms of ground truth (geometry, appearance, environment maps, etc.). In this thesis we recognize this data bottleneck and propose a comprehensive framework for controllable, high-quality, in-the-wild data generation that can support present and future applications in face capture. We approach this problem in four stages, starting with the building of a high-quality 3D face database consisting of a few hundred subjects in a studio setting. This database serves as a strong prior for 3D face geometry and appearance for several methods discussed in this thesis. To build this 3D database and to automate the registration of scans to a template mesh, we propose the first deep facial landmark detector capable of operating on 4K-resolution imagery while also achieving state-of-the-art performance on several in-the-wild benchmarks. Our second stage leverages the proposed 3D face database to build powerful nonlinear 3D morphable models for static geometry modeling and synthesis. We propose the first semantic deep face model that combines the semantic interpretability of traditional 3D morphable models with the nonlinear expressivity of neural networks. We later extend this semantic deep face model with a novel transformer-based architecture and propose the Shape Transformer, for representing and manipulating face shapes irrespective of their mesh connectivity.
The third stage of our data generation pipeline involves extending the approaches for static geometry synthesis to support facial deformations across time, so as to synthesize dynamic performances. To synthesize facial performances we propose two parallel approaches: one involving performance retargeting and another based on a data-driven 4D (3D + time) morphable model. We propose a local, anatomically constrained facial performance retargeting technique that uses only a handful of blendshapes (20 shapes) to achieve production-quality results. This retargeting technique can readily be used to create novel animations for any given actor via animation transfer. Our second contribution for generating facial performances is a transformer-based 4D autoencoder that encodes a sequence of expression blend weights into a learned performance latent space. Novel performances can then be generated at inference time by sampling this learned latent space. The fourth and final stage of our data generation pipeline involves the creation of photorealistic imagery to accompany the facial geometry and animations synthesized thus far. We propose a hybrid rendering approach that leverages state-of-the-art techniques for ray-traced skin rendering and a pretrained 2D generative model for photorealistic and consistent inpainting of the skin renders. Our hybrid rendering technique allows for the creation of an unlimited number of training samples in which the user has full control over the facial geometry, appearance, lighting, and viewpoint. The techniques presented in this thesis will serve as the foundation for creating large-scale photorealistic in-the-wild face datasets to support the next generation of realtime face capture.
dc.identifier.uri: https://diglib.eg.org:443/handle/10.2312/3543923
dc.language.iso: en_US
dc.publisher: ETH Zurich
dc.subject: Facial Performance Capture
dc.subject: Animation
dc.subject: Generative Models
dc.subject: Retargeting
dc.subject: Neural Rendering
dc.subject: Shape Modeling
dc.subject: Shape Generation
dc.subject: Morphable Model
dc.subject: Keypoint Detection
dc.subject: Face Tracking
dc.subject: Performance Capture
dc.subject: Data Driven Animation
dc.subject: Motion Capture
dc.title: Fully Controllable Data Generation For Realtime Face Capture
dc.type: Thesis
Files
Original bundle
Name: Fully_Controllable_Data_Generation_For_Realtime_Face_Capture_PrashanthChandran_2023.pdf
Size: 84.19 MB
Format: Adobe Portable Document Format
Description: Main Article
License bundle
Name: license.txt
Size: 1.79 KB
Description: Item-specific license agreed upon to submission