Browsing by Author "Bradley, Derek"
Now showing 1 - 11 of 11
Results Per Page
Sort Options
Item Accurate Real-time 3D Gaze Tracking Using a Lightweight Eyeball Calibration(The Eurographics Association and John Wiley & Sons Ltd., 2020) Wen, Quan; Bradley, Derek; Beeler, Thabo; Park, Seonwook; Hilliges, Otmar; Yong, Jun-Hai; Xu, Feng; Panozzo, Daniele and Assarsson, Ulf3D gaze tracking from a single RGB camera is very challenging due to the lack of information in determining the accurate gaze target from a monocular RGB sequence. The eyes tend to occupy only a small portion of the video, and even small errors in estimated eye orientations can lead to very large errors in the triangulated gaze target. We overcome these difficulties with a novel lightweight eyeball calibration scheme that determines the user-specific visual axis, eyeball size and position in the head. Unlike the previous calibration techniques, we do not need the ground truth positions of the gaze points. In the online stage, gaze is tracked by a new gaze fitting algorithm, and refined by a 3D gaze regression method to correct for bias errors. Our regression is pre-trained on several individuals and works well for novel users. After the lightweight one-time user calibration, our method operates in real time. Experiments show that our technique achieves state-of-the-art accuracy in gaze angle estimation, and we demonstrate applications of 3D gaze target tracking and gaze retargeting to an animated 3D character.Item Facial Animation with Disentangled Identity and Motion using Transformers(The Eurographics Association and John Wiley & Sons Ltd., 2022) Chandran, Prashanth; Zoss, Gaspard; Gross, Markus; Gotardo, Paulo; Bradley, Derek; Dominik L. Michels; Soeren PirkWe propose a 3D+time framework for modeling dynamic sequences of 3D facial shapes, representing realistic non-rigid motion during a performance. Our work extends neural 3D morphable models by learning a motion manifold using a transformer architecture. More specifically, we derive a novel transformer-based autoencoder that can model and synthesize 3D geometry sequences of arbitrary length. This transformer naturally determines frame-to-frame correlations required to represent the motion manifold, via the internal self-attention mechanism. Furthermore, our method disentangles the constant facial identity from the time-varying facial expressions in a performance, using two separate codes to represent neutral identity and the performance itself within separate latent subspaces. Thus, the model represents identity-agnostic performances that can be paired with an arbitrary new identity code and fed through our new identity-modulated performance decoder; the result is a sequence of 3D meshes for the performance with the desired identity and temporal length. We demonstrate how our disentangled motion model has natural applications in performance synthesis, performance retargeting, key-frame interpolation and completion of missing data, performance denoising and retiming, and other potential applications that include full 3D body modeling.Item Facial Expression Synthesis using a Global-Local Multilinear Framework(The Eurographics Association and John Wiley & Sons Ltd., 2020) Wang, Mengjiao; Bradley, Derek; Zafeiriou, Stefanos; Beeler, Thabo; Panozzo, Daniele and Assarsson, UlfWe present a practical method to synthesize plausible 3D facial expressions for a particular target subject. The ability to synthesize an entire facial rig from a single neutral expression has a large range of applications both in computer graphics and computer vision, ranging from the efficient and cost-effective creation of CG characters to scalable data generation for machine learning purposes. Unlike previous methods based on multilinear models, the proposed approach is capable to extrapolate well outside the sample pool, which allows it to plausibly predict the identity of the target subject and create artifact free expression shapes while requiring only a small input dataset. We introduce global-local multilinear models that leverage the strengths of expression-specific and identity-specific local models combined with coarse motion estimations from a global model. Experimental results show that we achieve high-quality, plausible facial expression synthesis results for an individual that outperform existing methods both quantitatively and qualitatively.Item Fast Nonlinear Least Squares Optimization of Large-Scale Semi-Sparse Problems(The Eurographics Association and John Wiley & Sons Ltd., 2020) Fratarcangeli, Marco; Bradley, Derek; Gruber, Aurel; Zoss, Gaspard; Beeler, Thabo; Panozzo, Daniele and Assarsson, UlfMany problems in computer graphics and vision can be formulated as a nonlinear least squares optimization problem, for which numerous off-the-shelf solvers are readily available. Depending on the structure of the problem, however, existing solvers may be more or less suitable, and in some cases the solution comes at the cost of lengthy convergence times. One such case is semi-sparse optimization problems, emerging for example in localized facial performance reconstruction, where the nonlinear least squares problem can be composed of hundreds of thousands of cost functions, each one involving many of the optimization parameters. While such problems can be solved with existing solvers, the computation time can severely hinder the applicability of these methods. We introduce a novel iterative solver for nonlinear least squares optimization of large-scale semi-sparse problems. We use the nonlinear Levenberg-Marquardt method to locally linearize the problem in parallel, based on its firstorder approximation. Then, we decompose the linear problem in small blocks, using the local Schur complement, leading to a more compact linear system without loss of information. The resulting system is dense but its size is small enough to be solved using a parallel direct method in a short amount of time. The main benefit we get by using such an approach is that the overall optimization process is entirely parallel and scalable, making it suitable to be mapped onto graphics hardware (GPU). By using our minimizer, results are obtained up to one order of magnitude faster than other existing solvers, without sacrificing the generality and the accuracy of the model. We provide a detailed analysis of our approach and validate our results with the application of performance-based facial capture using a recently-proposed anatomical local face deformation model.Item Graph-Based Synthesis for Skin Micro Wrinkles(The Eurographics Association and John Wiley & Sons Ltd., 2023) Weiss, Sebastian; Moulin, Jonathan; Chandran, Prashanth; Zoss, Gaspard; Gotardo, Paulo; Bradley, Derek; Memari, Pooran; Solomon, JustinWe present a novel graph-based simulation approach for generating micro wrinkle geometry on human skin, which can easily scale up to the micro-meter range and millions of wrinkles. The simulation first samples pores on the skin and treats them as nodes in a graph. These nodes are then connected and the resulting edges become candidate wrinkles. An iterative optimization inspired by pedestrian trail formation is then used to assign weights to those edges, i.e., to carve out the wrinkles. Finally, we convert the graph to a detailed skin displacement map using novel shape functions implemented in graphics shaders. Our simulation and displacement map creation steps expose fine controls over the appearance at real-time framerates suitable for interactive exploration and design. We demonstrate the effectiveness of the generated wrinkles by enhancing state-of-art 3D reconstructions of real human subjects with simulated micro wrinkles, and furthermore propose an artist-driven design flow for adding micro wrinkles to fictional characters.Item Improved Lighting Models for Facial Appearance Capture(The Eurographics Association, 2022) Xu, Yingyan; Riviere, Jérémy; Zoss, Gaspard; Chandran, Prashanth; Bradley, Derek; Gotardo, Paulo; Pelechano, Nuria; Vanderhaeghe, DavidFacial appearance capture techniques estimate geometry and reflectance properties of facial skin by performing a computationally intensive inverse rendering optimization in which one or more images are re-rendered a large number of times and compared to real images coming from multiple cameras. Due to the high computational burden, these techniques often make several simplifying assumptions to tame complexity and make the problem more tractable. For example, it is common to assume that the scene consists of only distant light sources, and ignore indirect bounces of light (on the surface and within the surface). Also, methods based on polarized lighting often simplify the light interaction with the surface and assume perfect separation of diffuse and specular reflectance. In this paper, we move in the opposite direction and demonstrate the impact on facial appearance capture quality when departing from these idealized conditions towards models that seek to more accurately represent the lighting, while at the same time minimally increasing computational burden. We compare the results obtained with a state-of-the-art appearance capture method [RGB*20], with and without our proposed improvements to the lighting model.Item Interactive Sculpting of Digital Faces Using an Anatomical Modeling Paradigm(The Eurographics Association and John Wiley & Sons Ltd., 2020) Gruber, Aurel; Fratarcangeli, Marco; Zoss, Gaspard; Cattaneo, Roman; Beeler, Thabo; Gross, Markus; Bradley, Derek; Jacobson, Alec and Huang, QixingDigitally sculpting 3D human faces is a very challenging task. It typically requires either 1) highly-skilled artists using complex software packages for high quality results, or 2) highly-constrained simple interfaces for consumer-level avatar creation, such as in game engines. We propose a novel interactive method for the creation of digital faces that is simple and intuitive to use, even for novice users, while consistently producing plausible 3D face geometry, and allowing editing freedom beyond traditional video game avatar creation. At the core of our system lies a specialized anatomical local face model (ALM), which is constructed from a dataset of several hundred 3D face scans. User edits are propagated to constraints for an optimization of our data-driven ALM model, ensuring the resulting face remains plausible even for simple edits like clicking and dragging surface points. We show how several natural interaction methods can be implemented in our framework, including direct control of the surface, indirect control of semantic features like age, ethnicity, gender, and BMI, as well as indirect control through manipulating the underlying bony structures. The result is a simple new method for creating digital human faces, for artists and novice users alike. Our method is attractive for low-budget VFX and animation productions, and our anatomical modeling paradigm can complement traditional game engine avatar design packages.Item Learning Dynamic 3D Geometry and Texture for Video Face Swapping(The Eurographics Association and John Wiley & Sons Ltd., 2022) Otto, Christopher; Naruniec, Jacek; Helminger, Leonhard; Etterlin, Thomas; Mignone, Graziana; Chandran, Prashanth; Zoss, Gaspard; Schroers, Christopher; Gross, Markus; Gotardo, Paulo; Bradley, Derek; Weber, Romann; Umetani, Nobuyuki; Wojtan, Chris; Vouga, EtienneFace swapping is the process of applying a source actor's appearance to a target actor's performance in a video. This is a challenging visual effect that has seen increasing demand in film and television production. Recent work has shown that datadriven methods based on deep learning can produce compelling effects at production quality in a fraction of the time required for a traditional 3D pipeline. However, the dominant approach operates only on 2D imagery without reference to the underlying facial geometry or texture, resulting in poor generalization under novel viewpoints and little artistic control. Methods that do incorporate geometry rely on pre-learned facial priors that do not adapt well to particular geometric features of the source and target faces. We approach the problem of face swapping from the perspective of learning simultaneous convolutional facial autoencoders for the source and target identities, using a shared encoder network with identity-specific decoders. The key novelty in our approach is that each decoder first lifts the latent code into a 3D representation, comprising a dynamic face texture and a deformable 3D face shape, before projecting this 3D face back onto the input image using a differentiable renderer. The coupled autoencoders are trained only on videos of the source and target identities, without requiring 3D supervision. By leveraging the learned 3D geometry and texture, our method achieves face swapping with higher quality than when using offthe- shelf monocular 3D face reconstruction, and overall lower FID score than state-of-the-art 2D methods. Furthermore, our 3D representation allows for efficient artistic control over the result, which can be hard to achieve with existing 2D approaches.Item A Perceptual Shape Loss for Monocular 3D Face Reconstruction(The Eurographics Association and John Wiley & Sons Ltd., 2023) Otto, Christopher; Chandran, Prashanth; Zoss, Gaspard; Gross, Markus; Gotardo, Paulo; Bradley, Derek; Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.Monocular 3D face reconstruction is a wide-spread topic, and existing approaches tackle the problem either through fast neural network inference or offline iterative reconstruction of face geometry. In either case carefully-designed energy functions are minimized, commonly including loss terms like a photometric loss, a landmark reprojection loss, and others. In this work we propose a new loss function for monocular face capture, inspired by how humans would perceive the quality of a 3D face reconstruction given a particular image. It is widely known that shading provides a strong indicator for 3D shape in the human visual system. As such, our new 'perceptual' shape loss aims to judge the quality of a 3D face estimate using only shading cues. Our loss is implemented as a discriminator-style neural network that takes an input face image and a shaded render of the geometry estimate, and then predicts a score that perceptually evaluates how well the shaded render matches the given image. This 'critic' network operates on the RGB image and geometry render alone, without requiring an estimate of the albedo or illumination in the scene. Furthermore, our loss operates entirely in image space and is thus agnostic to mesh topology. We show how our new perceptual shape loss can be combined with traditional energy terms for monocular 3D face optimization and deep neural network regression, improving upon current state-of-the-art results.Item Practical Person-Specific Eye Rigging(The Eurographics Association and John Wiley & Sons Ltd., 2019) Bérard, Pascal; Bradley, Derek; Gross, Markus; Beeler, Thabo; Alliez, Pierre and Pellacini, FabioWe present a novel parametric eye rig for eye animation, including a new multi-view imaging system that can reconstruct eye poses at submillimeter accuracy to which we fit our new rig. This allows us to accurately estimate person-specific eyeball shape, rotation center, interocular distance, visual axis, and other rig parameters resulting in an animation-ready eye rig. We demonstrate the importance of several aspects of eye modeling that are often overlooked, for example that the visual axis is not identical to the optical axis, that it is important to model rotation about the optical axis, and that the rotation center of the eye should be measured accurately for each person. Since accurate rig fitting requires hand annotation of multi-view imagery for several eye gazes, we additionally propose a more user-friendly ''lightweight'' fitting approach, which leverages an average rig created from several pre-captured accurate rigs. Our lightweight rig fitting method allows for the estimation of eyeball shape and eyeball position given only a single pose with a known look-at point (e.g. looking into a camera) and few manual annotations.Item Shape Transformers: Topology-Independent 3D Shape Models Using Transformers(The Eurographics Association and John Wiley & Sons Ltd., 2022) Chandran, Prashanth; Zoss, Gaspard; Gross, Markus; Gotardo, Paulo; Bradley, Derek; Chaine, Raphaëlle; Kim, Min H.Parametric 3D shape models are heavily utilized in computer graphics and vision applications to provide priors on the observed variability of an object's geometry (e.g., for faces). Original models were linear and operated on the entire shape at once. They were later enhanced to provide localized control on different shape parts separately. In deep shape models, nonlinearity was introduced via a sequence of fully-connected layers and activation functions, and locality was introduced in recent models that use mesh convolution networks. As common limitations, these models often dictate, in one way or another, the allowed extent of spatial correlations and also require that a fixed mesh topology be specified ahead of time. To overcome these limitations, we present Shape Transformers, a new nonlinear parametric 3D shape model based on transformer architectures. A key benefit of this new model comes from using the transformer's self-attention mechanism to automatically learn nonlinear spatial correlations for a class of 3D shapes. This is in contrast to global models that correlate everything and local models that dictate the correlation extent. Our transformer 3D shape autoencoder is a better alternative to mesh convolution models, which require specially-crafted convolution, and down/up-sampling operators that can be difficult to design. Our model is also topologically independent: it can be trained once and then evaluated on any mesh topology, unlike most previous methods. We demonstrate the application of our model to different datasets, including 3D faces, 3D hand shapes and full human bodies. Our experiments demonstrate the strong potential of our Shape Transformer model in several applications in computer graphics and vision.