2021

  • Mitigating Soft-Biometric Driven Bias and Privacy Concerns in Face Recognition Systems (Terhörst, Philipp)
  • Automation for camera-only 6D object detection (Rojtberg, Pavel)
  • Merging, extending and learning representations for 3D shape matching (Marin, Riccardo)
  • Extending the Design Space of E-textile Assistive Smart Environment Applications (Rus, Silvia Dorotheea)
  • Hybrid Modelling of Heterogeneous Volumetric Objects (Tereshin, Alexander)
  • Machine Learning For Plausible Gesture Generation From Speech For Virtual Humans (Ferstl, Ylva)
  • Real-time Human Performance Capture and Synthesis (Habermann, Marc Dr.-Ing.)
  • Static inverse modelling of cloth (Ly, Mickaël)
  • Exploring Parameters of Virtual Character Lighting Through Perceptual Evaluation and Psychophysical Modelling (Wisessing, Pisut)
  • Acquisition, Encoding and Rendering of Material Appearance Using Compact Neural Bidirectional Texture Functions (Rainer, Gilles)
  • Data-Driven Face Analysis for Performance Retargeting (Zoss, Gaspard)
  • A Search for Shape (van Blokland, Bart Iver)
  • Computational Analysis and Design of Structurally Stable Assemblies with Rigid Parts (Wang, Ziqi)
  • Concepts and methods to support the development and evaluation of remote collaboration using augmented reality (Marques, Bernardo)


Recent Submissions

  • Mitigating Soft-Biometric Driven Bias and Privacy Concerns in Face Recognition Systems
    (2021-04-20) Terhörst, Philipp
    Biometric verification refers to the automatic verification of a person’s identity based on their behavioural and biological characteristics. Among the various biometric modalities, the face is one of the most widely used, since it is easily acquirable in unconstrained environments and provides strong uniqueness. In recent years, face recognition systems have spread worldwide and are increasingly involved in critical decision-making processes such as finance, public security, and forensics. The growing effect of these systems on everyone’s daily life is driven by strong enhancements in their recognition performance, enabled by advances in extracting deeply-learned feature representations from face images. However, the success of these representations comes at the cost of two major discriminatory concerns, both driven by soft-biometric attributes such as demographics, accessories, health conditions, or hairstyles.

    The first concern is bias in face recognition. Current face recognition solutions are built on representation-learning strategies that optimize total recognition performance. These learning strategies often depend on the underlying distribution of soft-biometric attributes in the training data. Consequently, the behaviour of the learned face recognition solutions varies strongly with an individual’s soft-biometrics (e.g. with the individual’s ethnicity). The second concern is user privacy in such systems. Although face recognition systems are trained to recognize individuals from face images, the deeply-learned representation of an individual contains more information than just the person’s identity: privacy-sensitive information such as demographics, sexual orientation, or health status is encoded in such representations. For many applications, however, the biometric data is expected to be used for recognition only, which raises major privacy issues; unauthorized access to such privacy-sensitive information can lead to unfair or unequal treatment of the individual.

    Both issues are caused by the presence of soft-biometric attribute information in face images. Previous research focused on investigating the influence of demographic attributes on both concerns, and consequently its solutions addressed the mitigation of demographic concerns only. Moreover, these approaches require computationally heavy retraining of the deployed face recognition model and are thus hard to integrate into existing systems. Unlike previous works, this thesis proposes solutions for mitigating soft-biometric driven bias and privacy concerns in face recognition systems that are easily integrable into existing systems and aim for more comprehensive mitigation, not limited to pre-defined demographic attributes. The goal is to enhance the reliability, trust, and dissemination of these systems.

    The first part of this work provides in-depth investigations of soft-biometric driven bias and privacy concerns in face recognition over a wide range of soft-biometric attributes. The findings of these investigations guided the development of the proposed solutions. The investigations showed that a high number of soft-biometric and privacy-sensitive attributes are encoded in face representations, and that the presence of these attributes strongly influences the behaviour of face recognition systems. This demonstrates the strong need for more comprehensive privacy-enhancing and bias-mitigating technologies that are not limited to pre-defined (demographic) attributes. Guided by these findings, this work proposes solutions for mitigating bias in face recognition systems and for enhancing soft-biometric privacy in these systems. The proposed bias-mitigating solutions operate on the comparison and score level of the recognition system and can therefore be easily integrated. Incorporating the notion of individual fairness, which aims at treating similar individuals similarly, strongly mitigates bias of unknown origin and further improves the overall recognition performance of the system. The proposed solutions for enhancing soft-biometric privacy either manipulate existing face representations directly or change the representation type, including the inference process for verification. Manipulating existing face representations directly suppresses the encoded privacy-risk information in an easily integrable manner; the inference-level solutions, by contrast, suppress this information indirectly by changing the way it is encoded.

    To summarise, this work investigates soft-biometric driven bias and privacy concerns in face recognition systems and proposes solutions to mitigate them. Unlike previous works, the proposed approaches are (a) highly effective in mitigating these concerns, (b) not limited to concerns originating from specific attributes, and (c) easily integrable into existing systems. Moreover, the presented solutions are not limited to face biometrics and thus aim at enhancing the reliability, trust, and dissemination of biometric systems in general.
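    To make the verification setting concrete, the sketch below shows the basic comparison- and score-level operation on which the proposed bias-mitigating solutions build: two deeply-learned face representations are compared by cosine similarity, and the score is thresholded into a match decision. The embedding size, random vectors and threshold are illustrative assumptions, not the thesis' configuration.

```python
import numpy as np

# Illustrative comparison- and score-level verification; the embedding
# dimension, random vectors and threshold are assumptions, not the
# thesis' configuration.
def verification_score(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Cosine similarity between two deeply-learned face representations."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return float(np.dot(a, b))

rng = np.random.default_rng(0)
probe, reference = rng.normal(size=512), rng.normal(size=512)
score = verification_score(probe, reference)
print(score, score >= 0.3)   # hypothetical decision threshold of 0.3
```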
  • Automation for camera-only 6D object detection
    (2021-04-21) Rojtberg, Pavel
    Today, widespread deployment of Augmented Reality (AR) systems is only possible by means of computer vision frameworks like ARKit and ARCore, which abstract from specific devices yet restrict the set of supported devices to those of the respective vendor. This thesis therefore investigates how to allow deploying AR systems to any device with an attached camera. One crucial part of an AR system is the detection of arbitrary objects in the camera frame, naturally accompanied by the estimation of their 6D pose. This increases the degree of scene understanding that AR applications require for placing augmentations in the real world; currently, scene understanding is limited to a coarse segmentation of the scene into planes as provided by the aforementioned frameworks. Being able to reliably detect individual objects allows attaching specific augmentations, as required e.g. by AR maintenance applications. For this, we employ convolutional neural networks (CNNs) to estimate the 6D pose of all visible objects from a single RGB image. Here, the addressed challenge is the automated training of the respective CNN models, given only the CAD geometry of the target object. First, we look at reconstructing the missing surface data in real time before we turn to the more general problem of bridging the domain gap between the non-photorealistic representation and the real-world appearance. To this end, we build upon generative adversarial network (GAN) models to formulate the domain gap as an unsupervised learning problem. Our evaluation shows an improvement in model performance, while providing simplified handling compared to alternative solutions.

    Furthermore, the calibration data of the camera in use must be known for precise pose estimation. This data, again, is only available for the restricted set of devices that the proprietary frameworks support. To lift this restriction, we propose a web-based camera calibration service that not only aggregates calibration data, but also guides users in the calibration of new cameras. Here, we first present a novel calibration-pose selection framework that reduces the number of required calibration images by 30% compared to existing solutions, while ensuring a repeatable and reliable calibration outcome. Then, we present an evaluation of different user-guidance strategies, which allows choosing a setting suitable for most users. This enables even novice users to perform a precise camera calibration in about 2 minutes. Finally, we propose an efficient client-server architecture to deploy the aforementioned guidance on the web, making it available to the widest possible range of devices. This service is not restricted to AR systems, but allows the general deployment of computer vision algorithms on the web that rely on camera calibration data, which was previously not possible. Combined, these elements allow a semi-automatic deployment of AR systems with any camera to detect any object.
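    As a concrete illustration of camera-only pose estimation, the sketch below shows the classical Perspective-n-Point (PnP) step on which such pipelines commonly rely: given 2D-3D correspondences (synthesized here from a known pose; in a system like the one described, a CNN would predict them) and the camera intrinsics, OpenCV recovers the 6D pose. The cube geometry and intrinsics are made-up assumptions.

```python
import numpy as np
import cv2

# Hedged sketch: recover a 6D pose from 2D-3D correspondences with PnP.
# A CNN-based pipeline would predict the image points; here we synthesize
# them from a known ground-truth pose and check that PnP recovers it.
object_points = np.array([[x, y, z] for x in (0, 1)
                          for y in (0, 1) for z in (0, 1)], dtype=np.float32)
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float64)
rvec_gt = np.array([[0.1], [0.2], [0.3]])   # assumed ground-truth rotation
tvec_gt = np.array([[0.0], [0.0], [5.0]])   # assumed ground-truth translation

image_points, _ = cv2.projectPoints(object_points, rvec_gt, tvec_gt, K, None)
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
print(ok, rvec.ravel(), tvec.ravel())       # pose close to the ground truth
```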
  • Merging, extending and learning representations for 3D shape matching
    (2021-05-15) Marin, Riccardo
    In recent decades, researchers have devoted considerable attention to shape matching. Correlating surfaces unlocks otherwise impossible applications and analyses. However, non-rigid objects (like humans) have an enormous range of possible surface deformations, making the correspondence challenging to obtain. Computer Graphics and Vision have developed many different representations, each with its peculiarities, conveying different properties and easing different tasks. In this thesis, we exploit, extend, and propose representations to establish correspondences in the non-rigid domain. First, we show how the latent representation of a morphable model can be combined with the spectral embedding, acting as a regularizer for registration pipelines. We fill the gap in unconstrained problems like occlusion in RGB+D single views or partiality and topological noise in 3D representations. Furthermore, we define a strategy to densify the morphable model discretization and capture varying amounts of detail. We also analyze how different discretizations impact correspondence computation; to this end, we combine intrinsic and extrinsic embeddings, obtaining a robust representation that lets us transfer triangulations among shapes. Data-driven techniques are particularly relevant for capturing complex priors. Hence, we use deep learning techniques to obtain a new high-dimensional embedding for point clouds; in this representation, the objects align under a linear transformation. This approach shows resilience to sparsity and noise. Finally, we connect super-compact latent representations by linking autoencoder latent codes with Laplace-Beltrami operator spectra. This strategy lets us solve a long-standing problem, enriching the learning framework with geometric properties and matching objects regardless of their representations. The main contributions of this thesis are the theoretical and practical studies of representations, the advancement of shape matching, and, finally, the data and code produced and made publicly available.
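    As a toy illustration of the spectral embeddings referred to above, the sketch below computes a spectral embedding from a uniform graph Laplacian of a tiny, made-up mesh graph; an actual shape-matching pipeline would use the cotangent Laplace-Beltrami operator of the surface instead.

```python
import numpy as np

# Toy spectral embedding from a uniform graph Laplacian (illustrative only;
# a real pipeline would build the cotangent Laplace-Beltrami operator).
def spectral_embedding(n_vertices, edges, k=2):
    W = np.zeros((n_vertices, n_vertices))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0
    L = np.diag(W.sum(axis=1)) - W           # graph Laplacian L = D - W
    evals, evecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    # Skip the constant eigenvector; the next k coordinates form an
    # embedding that is stable under near-isometric deformations.
    return evals[1:k + 1], evecs[:, 1:k + 1]

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]    # made-up mesh connectivity
vals, embedding = spectral_embedding(4, edges, k=2)
print(vals, embedding.shape)                        # (2,) eigenvalues, (4, 2)
```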
  • Extending the Design Space of E-textile Assistive Smart Environment Applications
    (2021-06-30) Rus, Silvia Dorotheea
    The thriving field of Smart Environments has allowed computing devices to gain new capabilities and develop new interfaces, thus becoming more and more part of our lives. In many of these areas it is unthinkable to forgo assistive functionality such as comfort and safety functions while driving, safety functionality while working in an industrial plant, or self-optimization of daily activities with a smartwatch. Adults spend a lot of time on flexible surfaces such as office chairs, beds, or car seats; these are crucial parts of our environments. Yet even though environments have become smarter, with integrated computing gaining new capabilities and new interfaces, it is mostly rigid surfaces and objects that have become smarter. In this thesis, I build on the advantages flexible and bendable surfaces have to offer and look into the creation process of assistive Smart Environment applications leveraging these surfaces. I do this with three main contributions.

    First, since most Smart Environment applications are built into rigid surfaces, I extend the body of knowledge by designing new assistive applications integrated into flexible surfaces such as comfortable chairs, beds, or any type of soft, flexible object. These applications offer assistance through preventive functionality, e.g. decubitus ulcer prevention while lying in bed, back pain prevention while sitting on a chair, or emotion detection from movements on a couch. Second, I propose a new framework for the design process of flexible-surface prototypes, addressing the challenge that building hardware prototypes over multiple iterations consumes resources such as working time and material. I address this research challenge by creating a simulation framework which can be used to design applications with changing surface shape. In a first step I validate the simulation framework by building a real prototype and a simulated prototype and comparing the results in terms of the number of sensors and their placement. Furthermore, I use the simulation framework to analyse how the developer's level of experience influences an application design. Finally, since sensor capabilities play a major role during the design process, and humans often come into contact with surfaces made of fabric, I combine the integration advantages of fabric with those of capacitive proximity sensing electrodes. By conducting a multitude of capacitive proximity sensing measurements, I determine the performance of electrodes with varying properties such as material, shape, size, pattern density, stitching type, or supporting fabric. I discuss the results of this performance evaluation and condense them into guidelines for e-textile capacitive sensing electrodes, applied exemplarily to the use case of a bed sheet for breathing rate detection.
  • Hybrid Modelling of Heterogeneous Volumetric Objects
    (Bournemouth University, 2021-04-22) Tereshin, Alexander
    Heterogeneous multi-material volumetric modelling is an emerging and rapidly developing field. A heterogeneous object is a volumetric object with an interior structure over which different physically-based attributes are defined. The attributes can be of different natures: material distributions, density, microstructures, optical properties and others. Heterogeneous objects are widely used where the presence of interior structures is an important part of the model. Computer-aided design (CAD), additive manufacturing, physical simulations, visual effects, medical visualisation and computer art are examples of such applications. In particular, digital fabrication employing multi-material 3D printing techniques is becoming omnipresent. However, the specific methods and tools for the representation, modelling, rendering, animation and fabrication of multi-material volumetric objects with attributes are only starting to emerge, and the need for an adequate unifying theoretical and practical framework is evident.

    Developing adequate representational schemes for heterogeneous objects is at the core of research in this area. The most widely used representations for defining heterogeneous objects are boundary representation, distance-based representations, function representation and voxels. These representations work well for modelling homogeneous (solid) objects, but they all have significant drawbacks when dealing with heterogeneous objects. In particular, boundary representation, while maintaining its prevailing role in computer graphics and geometric modelling, is not inherently natural for dealing with heterogeneous objects, especially in the context of additive manufacturing and 3D printing, where multi-material properties are paramount, or in physical simulation, where an exact rather than approximate representation can be important.

    In this thesis, we introduce and systematically describe a theoretical and practical framework for modelling volumetric heterogeneous objects on the basis of a novel unifying functionally-based hybrid representation called HFRep. It is based on the function representation (FRep) and several distance-based representations, namely signed distance fields (SDFs), adaptively sampled distance fields (ADFs) and interior distance fields (IDFs), and it embraces the advantages of these initial representations while circumventing their disadvantages. A mathematically substantiated theoretical description of HFRep is provided, with an emphasis on defining functions for the geometry and attributes of HFRep objects. This mathematical framework serves as the basis for developing efficient algorithms for generating HFRep objects, taking into account both their geometry and attributes. To make the proposed approach practical, a detailed description of efficient algorithmic procedures has been developed. This required employing a number of novel techniques of different natures, separately and in combination. In particular, an extension of the fast iterative method (FIM) for numerically solving the eikonal equation on hierarchical grids was developed, allowing for the efficient computation of smooth distance-based attributes. To prove the concept, the main elements of the framework have been implemented and used in several applications of different natures. It was experimentally shown that the developed methods and tools can be used for generating objects with complex interior structures, e.g. microstructures, and different attributes. Special consideration has been devoted to applications of a dynamic nature: a novel heterogeneous space-time blending (HSTB) method with automatic control, for the metamorphosis of heterogeneous objects with textures in both 2D and 3D, has been introduced, algorithmised and implemented, and applied in the context of the ‘4D Cubism’ project. There are plans to use the developed methods and tools in many other applications.
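    The following minimal sketch illustrates the FRep convention that HFRep builds on (function names and the two-sphere scene are assumptions for illustration, not the thesis' code): an object is a real-valued defining function, positive inside, zero on the surface, negative outside, and set operations are expressed with R-functions.

```python
import numpy as np

# FRep-style sketch: a defining function f is positive inside the object,
# zero on its surface and negative outside; R-functions give exact set
# operations on such functions (illustrative helper names).
def sphere(p, center, radius):
    return radius - np.linalg.norm(p - center, axis=-1)

def r_union(f1, f2):                 # R-function union
    return f1 + f2 + np.sqrt(f1 * f1 + f2 * f2)

def r_intersection(f1, f2):          # R-function intersection
    return f1 + f2 - np.sqrt(f1 * f1 + f2 * f2)

p = np.array([0.5, 0.0, 0.0])
f = r_union(sphere(p, np.array([0.0, 0.0, 0.0]), 1.0),
            sphere(p, np.array([1.0, 0.0, 0.0]), 1.0))
print(f > 0)   # True: the point lies inside the union of the two spheres
```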
  • Machine Learning For Plausible Gesture Generation From Speech For Virtual Humans
    (Trinity College Dublin, The University of Dublin, 2021-08-03) Ferstl, Ylva
    The growing use of virtual humans in an array of applications such as games, human-computer interfaces, and virtual reality demands the design of appealing and engaging characters, while minimizing the cost and time of creation. Nonverbal behavior is an integral part of human communication and important for believable embodied virtual agents. Co-speech gesture represents a key aspect of nonverbal communication, and virtual agents are more engaging when exhibiting gesture behavior. Hand-animating gesture is costly and does not scale to applications where agents may produce new utterances after deployment. Automated gesture generation is therefore attractive, enabling any new utterance to be animated on the go. A major body of research has been dedicated to methods of automatic gesture generation, but generating expressive and well-defined gesture motion has commonly relied on explicit formulations of if-then rules or probabilistic modelling of annotated features. Machine learning approaches, able to work on unlabelled data, are catching up; however, they often still produce averaged motion that fails to capture the speech-gesture relationship adequately. The results from machine-learned models point to the high complexity of the speech-to-motion learning task. In this work, we explore a number of machine learning methods for improving the speech-to-motion learning outcome, including transfer learning from speech and motion models, adversarial training, and modelling explicit expressive gesture parameters from speech. We develop a method for automatically segmenting individual gestures from a motion stream, enabling detailed analysis of the speech-gesture relationship. We present two large multimodal datasets of conversational speech and motion, designed specifically for this modelling problem. We finally present and evaluate a novel speech-to-gesture system, merging methods of machine learning and database sampling.
  • Real-time Human Performance Capture and Synthesis
    (Saarländische Universitäts- und Landesbibliothek, 2021) Habermann, Marc Dr.-Ing.
    Most of the images one finds in the media, such as on the Internet or in textbooks and magazines, contain humans as the main point of attention. Thus, there is an inherent necessity for industry, society, and private persons to be able to thoroughly analyze and synthesize the human-related content in these images. One aspect of this analysis, and the subject of this thesis, is to infer the 3D pose and surface deformation using only visual information, which is also known as human performance capture. Human performance capture enables the tracking of virtual characters from real-world observations, which is key for visual effects, games, VR, and AR, to name just a few application areas. However, traditional capture methods usually rely on multi-view (marker-based) systems that are prohibitively expensive for the vast majority of people, or they use depth sensors, which are still not as common as single color cameras. Recently, some approaches have attempted to solve the task assuming only a single RGB image is given. Nonetheless, they either cannot track the dense deforming geometry of the human, such as the clothing layers, or they are far from real time, which is indispensable for many applications. To overcome these shortcomings, this thesis proposes two monocular human performance capture methods, which for the first time allow real-time capture of the dense deforming geometry as well as unprecedented 3D accuracy for pose and surface deformations. At the technical core, this work introduces novel GPU-based and data-parallel optimization strategies in conjunction with other algorithmic design choices that are all geared towards real-time performance at high accuracy. Moreover, this thesis presents a new weakly supervised multi-view training strategy combined with a fully differentiable character representation that shows superior 3D accuracy.

    However, there is more to human-related Computer Vision than the analysis of people in images. It is equally important to synthesize new images of humans in unseen poses and from camera viewpoints that have not been observed in the real world. Such tools are essential for the movie industry because they allow, for example, the synthesis of photo-realistic virtual worlds with real-looking humans, or of content that is too dangerous for actors to perform on set. Video conferencing and telepresence applications can also benefit from photo-real 3D characters, as these can enhance the immersive experience. The traditional Computer Graphics pipeline for rendering photo-realistic images involves many tedious and time-consuming steps that require expert knowledge and are far from real time: character rigging and skinning, modeling of surface appearance properties, and physically based ray tracing. Recent learning-based methods attempt to simplify the traditional rendering pipeline by learning the rendering function from data, resulting in methods that are more accessible to non-experts. However, most of them model the synthesis task entirely in image space, so that 3D consistency cannot be achieved, and/or they fail to model motion- and view-dependent appearance effects. To this end, this thesis presents a method, and ongoing work, on character synthesis that allows the synthesis of controllable, photo-real characters which achieve motion- and view-dependent appearance effects as well as 3D consistency, and which run in real time. This is technically achieved by a novel coarse-to-fine geometric character representation for efficient synthesis, which can be supervised solely on multi-view imagery. Furthermore, this work shows how such a geometric representation can be combined with an implicit surface representation to boost synthesis and geometric quality.
  • Static inverse modelling of cloth
    (2021-09-28) Ly, Mickaël
    This thesis deals with the direct simulation and inverse design of garments in the presence of frictional contact. The shape of draped garments results from the slenderness of the fabric, which can be represented in mechanics by a thin elastic plate or shell, and from its interaction with the body through dry friction. This interaction, necessary to reproduce the threshold friction occurring in such contacts, is described by a non-smooth law which, in general, makes its integration complex. In a first contribution, we modify the so-called Projective Dynamics algorithm to incorporate this dry frictional contact law in a simple way. Projective Dynamics is a popular method in Computer Graphics that quickly simulates deformable objects such as plates with moderate accuracy, yet without including frictional contact. The rationale of this algorithm is to solve the integration of the dynamics by successively computing estimates of the shape of the object at the next timestep. We take up the same idea to incorporate a procedure for estimating the frictional contact law that robustly captures the threshold phenomenon.

    It is interesting to note that simulators developed in Computer Graphics, originally targeted at visual animation, have become increasingly accurate over the years. They are now being used in more "critical" applications such as architecture, robotics or medicine, which are more demanding in terms of accuracy. In collaboration with mechanicians and experimental physicists, we introduce protocols to the Computer Graphics community for verifying the correctness of simulators, and we present in this manuscript our contributions related to plate and shell simulators.

    Finally, in a last part, we focus on garment inverse design. The interest of this process is twofold. Firstly, for simulations, solving the inverse problem provides a "force-free" and possibly curved version of the input (called the rest or natural shape), whether it comes from a 3D design or a 3D capture, which allows the simulation to start with the input as the initial deformed shape. To this end, we propose an algorithm for the inverse design of clothes represented by thin shells that also accounts for dry frictional contact. Within our framework, the input shape is considered to be a mechanical equilibrium subject to gravity and contact forces. Our algorithm then computes a rest shape such that this input shape can be simulated without any sagging. Secondly, it is also appealing to use these rest shapes in a real-life application, to manufacture the designed garments without sagging. However, the traditional cloth fabrication process is based on patterns, that is, sets of flat panels sewn together. In this regard, we present, in a more prospective part, our results on adapting the previous algorithm to include geometric constraints, namely surface developability, in order to obtain flattenable rest shapes.
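    The following sketch shows the baseline local/global loop of Projective Dynamics for a 2D mass-spring chain, the scheme that the first contribution modifies; the dry frictional contact estimation described above is omitted, and all sizes and stiffnesses are illustrative assumptions.

```python
import numpy as np

# Baseline Projective Dynamics loop for a 2D mass-spring chain (a sketch
# under simplifying assumptions; the thesis modifies this scheme to embed
# a dry frictional contact estimate inside the iteration, omitted here).
n, h, w, w_pin = 5, 0.02, 100.0, 1e6
x = np.stack([np.linspace(0.0, 1.0, n), np.zeros(n)], axis=1)
v = np.zeros_like(x)
m = np.ones(n)
g = np.array([0.0, -9.81])
springs = [(i, i + 1) for i in range(n - 1)]
rest = [np.linalg.norm(x[j] - x[i]) for i, j in springs]
pin = x[0].copy()                              # attach the first particle

# Constant global matrix M/h^2 + constraint terms (prefactorable).
A = np.diag(m) / h**2
for i, j in springs:
    A[i, i] += w; A[j, j] += w; A[i, j] -= w; A[j, i] -= w
A[0, 0] += w_pin

for step in range(100):
    s = x + h * v + h**2 * g                   # inertial prediction
    y = s.copy()
    for _ in range(10):                        # local/global iterations
        rhs = (m[:, None] / h**2) * s
        rhs[0] += w_pin * pin                  # attachment projection
        for (i, j), r in zip(springs, rest):
            d = y[i] - y[j]
            p = r * d / np.linalg.norm(d)      # local step: project spring
            rhs[i] += w * p; rhs[j] -= w * p
        y = np.linalg.solve(A, rhs)            # global step: one linear solve
    v = (y - x) / h
    x = y
print(x)   # chain has swung under gravity; the first particle stays pinned
```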
  • Exploring Parameters of Virtual Character Lighting Through Perceptual Evaluation and Psychophysical Modelling
    (Trinity College Dublin, 2021-06-02) Wisessing, Pisut
    This thesis explores the parameters of virtual character lighting and their connections to the perceived emotion and appeal of the character. Our main interest is to empirically evaluate common practices for setting up these parameters in traditional art forms, such as painting, theatre and cinematography, and their psychological effects on the perception of the character according to artistic conventions. We also aim to establish a general guideline for lighting design that enhances the inner states of virtual avatars for maximum audience engagement. We conducted an extensive set of novel psychophysical experiments to assess the links between the physical properties of lighting and the responses of the audience. The results are discussed in relation to theories from the literature on visual perception, psychology and anthropology. We adapted classic research methodologies, such as multidimensional scaling analysis, the method of constant stimuli and the method of adjustment, to the modern research question of how we perceive virtual characters and what makes them engaging in various applications, for example self-avatars on social media platforms, which have drawn massive interest from professional developers and casual makers alike. Some of our findings agreed, and some disagreed, with certain codes of cinematic lighting. Based on these newfound insights, we derived a set of lighting guidelines that can be used to enhance the emotion and appeal of digital characters, and demonstrated a use case in the form of a perceptual lighting tool. Moreover, our experiment designs, particularly the method of adjustment with real-time graphics, break new ground for future research on virtual avatars. In summary, our contributions find applications in both industry practice and academic research.
  • Acquisition, Encoding and Rendering of Material Appearance Using Compact Neural Bidirectional Texture Functions
    (2021-11-23) Rainer, Gilles
    This thesis addresses the problem of photo-realistic rendering of real-world materials. Currently, the most faithful approach to rendering an existing material is scanning its Bidirectional Texture Function (BTF), which relies on exhaustive acquisition of reflectance data from the material sample. This incurs heavy costs in terms of both capture time and memory requirements, meaning the main drawback is a lack of practicality. The scope of this thesis is two-fold: the implementation of a full BTF pipeline (data acquisition, processing and rendering) and the design of a compact neural material representation. We first present our custom BTF scanner, which uses a freely positionable camera and light source to acquire light- and view-dependent textures. During the processing phase, the textures are extracted from the images and rectified onto a unique grid using an estimated proxy surface. At rendering time, the rectification is reverted, and the estimated height field additionally allows the preservation of material silhouettes. The main part of the thesis is the development of a neural BTF model that is both compact in memory and practical for rendering. Concretely, the material is modeled by a small fully-connected neural network, parametrized by light and view directions as well as a vector of latent parameters that describes the appearance of the point. We first show that one network can efficiently learn to reproduce the appearance of one given material. The second focus of our work is to find an efficient method to translate BTFs into our representation: rather than training a new network instance for each new material, the latent space and network are shared, and we use an encoder network to quickly predict latent parameters for new, unseen materials. All contributions are geared towards making photo-realistic rendering with BTFs more common and practical in computer graphics applications like games and virtual environments.
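    The sketch below illustrates the general shape of such a compact neural BTF decoder (layer sizes, the latent dimension and the 2D direction inputs are assumptions for illustration, not the thesis' exact architecture): a small fully-connected network maps a per-texel latent code plus light and view directions to an RGB reflectance value.

```python
import torch
import torch.nn as nn

# Hedged sketch of a compact neural BTF decoder; layer sizes, the latent
# dimension and the 2D direction parametrization are assumptions.
class NeuralBTFDecoder(nn.Module):
    def __init__(self, latent_dim=8, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),                 # RGB reflectance
        )

    def forward(self, latent, light_dir, view_dir):
        # latent: (N, latent_dim) per-texel appearance code;
        # light_dir, view_dir: (N, 2) projected light/view directions.
        return self.net(torch.cat([latent, light_dir, view_dir], dim=-1))

decoder = NeuralBTFDecoder()
rgb = decoder(torch.randn(1, 8), torch.zeros(1, 2), torch.zeros(1, 2))
print(rgb.shape)   # torch.Size([1, 3]): one shaded texel
```

    Sharing one such decoder across materials, with only the latent codes varying per texel and per material, is what keeps the representation compact.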
  • Data-Driven Face Analysis for Performance Retargeting
    (ETH Zurich, 2022-05-25) Zoss, Gaspard
    The democratization of digital humans in entertainment has been made possible by recent advances in performance capture, rendering and animation techniques. The human face, which is key to realism, is very complex to animate by hand, so facial performance capture is nowadays often used to acquire a starting point for the animation. Most of the time, however, captured actors are not re-rendered directly on screen; their performance is retargeted to other characters or fantasy creatures. The task of retargeting facial performances brings forth multiple challenging questions, such as how to map the performance of one actor onto another, how to represent the data to do so optimally, and how to maintain artistic control throughout, to cite only a few. These challenges make facial performance retargeting an active and exciting area of research.

    In this dissertation, we present several contributions towards solving the retargeting problem. We first introduce a novel jaw rig, designed using ground-truth jaw motion data acquired with a novel capture method specifically designed for this task. Our jaw rig allows for direct and indirect controls while restricting the motion of the mandible to physiologically possible poses only. We use a well-known concept from dentistry, the Posselt envelope of motion, to parameterize its controls. Finally, we show how this jaw rig can be retargeted to unseen actors or creatures. Our second contribution is a novel markerless method to accurately track the underlying jaw bone. We use our jaw motion capture method to acquire a dataset of ground-truth jaw motion and geometry, and learn a non-linear mapping between the facial skin deformation and the motion of the underlying bone. We also demonstrate how this method can be used on actors for whom no ground-truth jaw motion is acquired, outperforming currently used techniques. In most modern performance capture methods, the facial geometry inevitably contains parasitic dynamic motion which is, most of the time, undesired; this is especially true in the context of performance retargeting. Our third contribution aims to compute and characterize the difference between the captured dynamic facial performance and a speculative quasistatic variant of the same motion, had inertial effects been absent. We show how our method can be used to remove secondary dynamics from a captured performance and to synthesize novel dynamics given novel head motion. Our last contribution tackles a different kind of retargeting problem: re-aging of facial performances in image space. In contrast to existing methods, we specifically tackle the problem of high-resolution, temporally stable re-aging. We show how a synthetic dataset can be computed using a state-of-the-art generative adversarial network and used to train our re-aging network. Our method allows fine-grained continuous age control and intuitive artistic effects such as localized control. We believe the methods presented in this thesis will solve or alleviate some of the problems in modern performance retargeting and will inspire exciting future work.
  • A Search for Shape
    (Norwegian University of Science and Technology, 2021-12-03) van Blokland, Bart Iver
    As 3D object collections grow, searching based on shape becomes crucial. 3D capture has risen in popularity over the past decade and is currently being adopted in consumer mobile hardware such as smartphones and tablets, increasing the accessibility of this technology and, by extension, the volume of 3D scans. New applications based on large 3D object collections are expected to become commonplace and will require 3D object retrieval similar to the image-based search available in current search engines. The work documented in this thesis consists of three primary contributions. The first is the RICI and QUICCI local 3D shape descriptors, which use the novel idea of intersection counts for shape description. They are shown to be highly resistant to clutter and capable of effectively utilising the GPU for efficient generation and comparison of descriptors. Advantages of these descriptors over the previous state of the art include speed, size, descriptiveness and resistance to clutter, as shown by a newly proposed benchmark. The second primary contribution consists of two indexing schemes, the Hamming tree and the Dissimilarity tree, which are capable of indexing and retrieving binary descriptors (such as the QUICCI descriptor) and efficiently use the Hamming distance and the proposed Weighted Hamming distance, respectively. The Dissimilarity tree in particular is capable of retrieving nearest-neighbour descriptors even when their Hamming distance is large, an aspect where previous approaches tend to scale poorly. The third major contribution combines the proposed QUICCI descriptor and Dissimilarity tree into a complete pipeline for partial 3D object retrieval: a collection of complete objects is indexed using the Dissimilarity tree, and objects similar to a partial query object can subsequently be retrieved efficiently. It is thus shown that local descriptors based on shape intersection counts can be applied effectively to tasks such as clutter-resistant matching and partial 3D shape retrieval. Highly efficient GPU implementations of the proposed descriptors, as well as several popular ones, have been made publicly available to the research community and may assist further developments in the field.
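    As a minimal illustration of the retrieval primitive underlying the Hamming tree, the sketch below computes the Hamming distance between two packed binary descriptors via bit counting; the descriptor width is an assumption, not the QUICCI layout.

```python
import numpy as np

# Hamming distance between packed binary descriptors, the elementary
# comparison behind Hamming-tree retrieval (descriptor width is assumed).
def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    # a, b: uint8 arrays holding the packed bits of one descriptor each.
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

rng = np.random.default_rng(0)
d1 = rng.integers(0, 256, size=128, dtype=np.uint8)   # a 1024-bit descriptor
d2 = d1.copy()
d2[0] ^= 0b00001010                                   # flip two bits
assert hamming_distance(d1, d2) == 2
print(hamming_distance(d1, d2))
```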
  • Computational Analysis and Design of Structurally Stable Assemblies with Rigid Parts
    (Lausanne, EPFL, 2021-12-10) Wang, Ziqi
    An assembly refers to a collection of parts joined together to achieve a specific form and/or functionality. Assemblies make it possible to fabricate large and complex objects from several small and simple parts. Such parts can be assembled and disassembled repeatedly, benefiting the transportation and maintenance of the assembly. Due to these advantages, assemblies are ubiquitous in our daily lives, including most furniture, household appliances, and architecture. Recent advances in digital fabrication have lowered the hurdles for fabricating objects with complex shapes. However, designing physically plausible assemblies is still a non-trivial task, as a slight local modification of a part's geometry can have a global impact on the structural and/or functional performance of the whole assembly. New computational tools are therefore developed to enable general users to take part in the design process and exploit their imagination.

    This thesis focuses on static assemblies with rigid parts. We develop computational methods for analyzing and designing assemblies that are structurally stable and assemblable. To address this problem, we use integral joints, i.e. tenon and mortise, which have historically been used because of their reversibility, which simplifies the disassembly process significantly. Properly arranged integral joints can restrict the parts' relative movement and thereby improve structural stability. However, manually finding the right joint geometry is a tedious and error-prone task. Inspired by the kinematic-static duality, we first propose a new kinematic-based method for analyzing the structural stability of assemblies. We then develop a two-stage computational design framework based on this new analysis method: the kinematic design stage determines the amount of motion restriction that joints must impose to make a given assembly stable in the motion space, and the geometric design stage searches for joint shapes that satisfy the motion-restriction requirements computed in the previous stage. To solve the problem numerically, we propose joint motion cones to measure the motion-restriction capacity of given joints; a toy version of the underlying blocking test is sketched below. Compared with previous works, our framework can efficiently handle inputs with complex geometry. Moreover, our design framework is very flexible and can easily be adapted to various applications. First, we focus on designing globally interlocking assemblies that can withstand arbitrary external forces and torques. Second, we are interested in designing assemblies of rigid convex blocks that approximate freeform surfaces; our design framework can optimize the blocks' shapes to generate assemblies with good resistance against lateral forces and, in some cases, globally interlocking assemblies. Lastly, we present a method for designing complex assemblies with cone joints; by optimizing the shapes of the cone joints, our design framework can find the best trade-off between structural stability and assemblability. We validate our computational tools by fabricating a series of physical prototypes. Our algorithms have great potential for solving various assembly design problems, ranging from small scale, such as toys and furniture, to large scale, such as art installations and architecture. For example, the proposed techniques could be applied to designing discrete architecture that can be constructed automatically by robots.
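    The sketch below gives a toy first-order version of the blocking idea behind motion cones (an illustration only, not the thesis' solver): a part can translate along a direction only if that direction does not push into any contact plane.

```python
import numpy as np

# Toy first-order blocking test (not the thesis' solver): a translation
# direction d is feasible only if it does not push into any contact,
# i.e. n . d >= 0 for every contact normal n pointing away from the
# touching part. The feasible directions form the part's motion cone.
def is_free_direction(d, contact_normals, eps=1e-9):
    d = d / np.linalg.norm(d)
    return all(np.dot(n, d) >= -eps for n in contact_normals)

normals = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
print(is_free_direction(np.array([1.0, 1.0, 0.0]), normals))   # True
print(is_free_direction(np.array([-1.0, 0.0, 0.0]), normals))  # False: blocked
```

    Well-designed joints shrink this cone; a globally interlocking assembly leaves no free direction for any part except the key.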
  • Concepts and methods to support the development and evaluation of remote collaboration using augmented reality
    (University of Aveiro, 2021) Marques, Bernardo
    Remote Collaboration using Augmented Reality (AR) shows great potential to establish a common ground in physically distributed scenarios where team-members need to achieve a shared goal. However, most research efforts in this field have been devoted to experiment with the enabling technology and propose methods to support its development. As the field evolves, evaluation and characterization of the collaborative process become an essential, but difficult endeavor, to better understand the contributions of AR. In this thesis, we conducted a critical analysis to identify the main limitations and opportunities of the field, while situating its maturity and proposing a roadmap of important research actions. Next, a human-centered design methodology was adopted, involving industrial partners to probe how AR could support their needs during remote maintenance. These outcomes were combined with literature methods into an AR-prototype and its evaluation was performed through a user study. From this, it became clear the necessity to perform a deep reflection in order to better understand the dimensions that influence and must/should be considered in Collaborative AR. Hence, a conceptual model and a human-centered taxonomy were proposed to foster systematization of perspectives. Based on the model proposed, an evaluation framework for contextualized data gathering and analysis was developed, allowing support the design and performance of distributed evaluations in a more informed and complete manner. To instantiate this vision, the CAPTURE toolkit was created, providing an additional perspective based on selected dimensions of collaboration and pre-defined measurements to obtain “in situ” data about them, which can be analyzed using an integrated visualization dashboard. The toolkit successfully supported evaluations of several team-members during tasks of remote maintenance mediated by AR. Thus, showing its versatility and potential in eliciting a comprehensive characterization of the added value of AR in real-life situations, establishing itself as a general-purpose solution, potentially applicable to a wider range of collaborative scenarios.