2020

Permanent URI for this collection

https://diglib.eg.org/handle/10.2312/2632877

Authoring consistent, animated ecosystems : Efficient learning from partial data
(2020-12-03) Ecormier-Nocca, Pierre;
With recent increases in computing power, virtual worlds are now larger and more complex than ever. As such content becomes widespread in many different media, the expectation of realism has also dramatically increased for the end user. As a result, a large body of work has been accomplished on the modeling and generation of terrains and vegetation, sometimes also considering their interactions. However, animals have received far less attention and are often considered in isolation. Along with a lack of authoring tools, this makes the modeling of ecosystems difficult for artists, who are either limited in their creative freedom or forced to break biological realism. In this thesis, we present new methods suited to the authoring of ecosystems, allowing creative freedom without discarding biological realism. We provide data-centered tools for efficient authoring, while keeping a low data requirement. By taking advantage of existing biology knowledge, we are able to guarantee both the consistency and quality of the results. We present dedicated methods for precise and intuitive instantiation of static and animated elements. Since static elements, such as vegetation, can exhibit complex interactions, we propose an accurate example-based method to synthesize complex and potentially overlapping arrangements. We apply a similar concept to the authoring of herds of animals, by using photographs or videos as input for example-based synthesis. At a larger scale, we use biological data to formulate a unified pipeline handling the global instantiation and long-term interactions of vegetation and animals. While this model enforces biological consistency, we also provide control by allowing manual editing of the data at any stage of the process. Our methods provide both user control and realism over the entire ecosystem creation pipeline, covering static and dynamic elements, as well as interactions between themselves and their environment.
We also cover different scales, from individual placement and movement of elements to management of the entire ecosystem. We validate our results with user studies and comparisons with both real and expert data.
Compendium of Publications on: Differential Operators on Manifolds for CAD CAM CAE and Computer Graphics
(EAFIT University, 2020-05-20) Mejia-Parra, Daniel;
This Doctoral Thesis develops novel articulations of Differential Operators on Manifolds for applications in Computer Aided Design, Manufacture and Computer Graphics, as follows: (1) Mesh Parameterization and Segmentation. Development and application of Laplace-Beltrami, Hessian, Geodesic and Curvature operators for topology- and geometry-driven segmentations and parameterizations of 2-manifold triangular meshes. Applications in Reverse Engineering, Manufacturing and Medicine. (2) Computing of Laser-driven Temperature Maps in thin plates. Spectral-domain-based analytic solutions of the transient, non-homogeneous heat equation for simulation of temperature maps in multi-laser heated thin plates, modeled as 2-manifolds plus thickness. (3) Real-time estimation of dimensional compliance of hot out-of-forge workpieces. A Special Orthogonal SO(3) transformation between 2-manifolds is found, which enables a distance operator between 2-manifolds in R^3 (or m-manifolds in R^n). This process instruments the real-time assessment of dimensional compliance of hot workpieces on the factory shop floor. (4) Slicing or Level-Set computation for 2-manifold triangular meshes in Additive Manufacturing. Development of a classification of non-degenerate (i.e. non-singular Hessian) and degenerate (i.e. singular Hessian) critical points of non-Morse functions on 2-manifold objects, followed by computation of level sets for Additive Manufacturing. Most of the aforementioned contributions have been screened and accepted by the international scientific community (and published). Non-published material corresponds to confidential developments which are commercially exploited by the sponsors and therefore banned from dissemination.
Computational design of curved thin shells: from glass façades to programmable matter
(IST Austria, 2020-09-21) Guseinov, Ruslan;
Fabrication of curved shells plays an important role in modern design, industry, and science. Among their remarkable properties are, for example, aesthetics of organic shapes, ability to evenly distribute loads, or efficient flow separation. They find applications across vast length scales ranging from skyscraper architecture to microscopic devices. But, at the same time, the design of curved shells and their manufacturing process pose a variety of challenges. In this thesis, they are addressed from several perspectives. In particular, this thesis presents approaches based on the transformation of initially flat sheets into the target curved surfaces. This involves problems of interactive design of shells with nontrivial mechanical constraints, inverse design of complex structural materials, and data-driven modeling of delicate and time-dependent physical properties. At the same time, two newly-developed self-morphing mechanisms targeting flat-to-curved transformation are presented. In architecture, doubly curved surfaces can be realized as cold bent glass panelizations. Originally flat glass panels are bent into frames and remain stressed. This is a cost-efficient fabrication approach compared to hot bending, when glass panels are shaped plastically. However, such constructions are prone to breaking during bending, and it is highly nontrivial to navigate the design space, keeping the panels fabricable and aesthetically pleasing at the same time. We introduce an interactive design system for cold bent glass façades, while previously even offline optimization for such scenarios has not been sufficiently developed. Our method is based on a deep learning approach providing quick and high precision estimation of glass panel shape and stress while handling the shape multimodality. Fabrication of smaller objects of scales below 1 m can also greatly benefit from shaping originally flat sheets.
In this respect, we designed new self-morphing shell mechanisms transforming from an initial flat state to a doubly curved state with high precision and detail. Our so-called CurveUps demonstrate the encodement of the geometric information into the shell. Furthermore, we explored the frontiers of programmable materials and showed how temporal information can additionally be encoded into a flat shell. This allows prescribing deformation sequences for doubly curved surfaces and, thus, facilitates self-collision avoidance enabling complex shapes and functionalities otherwise impossible. Both of these methods include inverse design tools keeping the user in the design loop.
Face Morphing and Morphing Attack Detection
(2020-11-16) Scherhag, Ulrich Johannes;
In modern society, biometrics is gaining more and more importance, driven by the increase in recognition performance of the systems. In some areas, such as automatic border controls, there is no alternative to the application of biometric systems. Despite all the advantages of biometric systems, the vulnerability of these still poses a problem. Facial recognition systems for example offer various attack points, like faces printed on paper or silicone masks. Besides the long known and well researched presentation attacks there is also the danger of the so-called morphing attack. The research field of morphing attacks is quite young, which is why it has only been investigated to a limited extent so far. Publications proposing algorithms for the detection of morphing attacks often lack uniform databases and evaluation methods, which leads to a restricted comparability of the previously published work. Thus, the focus of this thesis is the comprehensive analysis of different features and classifiers in their suitability as algorithms for the detection of morphing attacks. In this context, evaluations are performed with uniform metrics on a realistic morphing database, allowing the simulation of various realistic scenarios. If only the suspected morph is available, a HOG feature extraction in combination with an SVM is able to detect morphs with a D-EER ranging from 13.25% to 24.05%. If a trusted live capture image is available in addition, for example from a border gate, the deep ArcFace features in combination with an SVM can detect morphs with a D-EER ranging from 2.71% to 7.17%.
From Neurons to Behavior: Visual Analytics Methods for Heterogeneous Spatial Big Brain Data
(2019-09) Florian Johann Ganglberger;
Advances in neuro-imaging have allowed big brain initiatives and consortia to create vast resources of brain data that can be mined for insights into mental processes and biological principles. Research in this area does not only relate to mind and consciousness, but also to the understanding of many neurological disorders, such as Alzheimer's disease, autism, and anxiety. Exploring the relationships between genes, brain circuitry, and behavior is therefore a key element in research that requires the joint analysis of a heterogeneous set of spatial brain data, including 3D imaging data, anatomical data, and brain networks at varying scales, resolutions, and modalities. Due to high-throughput imaging platforms, this data's size and complexity go beyond the state of the art by several orders of magnitude. Current analytical workflows involve time-consuming manual data aggregation and extensive computational analysis in script-based toolboxes. Visual analytics methods for exploring big brain data can support neuroscientists in this process, so they can focus on understanding the data rather than handling it. In this thesis, several contributions that target this problem are presented. The first contribution is a computational method that fuses genetic information with spatial gene expression data and connectivity data to predict functional neuroanatomical maps. These maps indicate which brain areas might be related to a specific function or behavior. The approach has been applied to predict yet unknown functional neuroanatomy underlying multigenic behavioral traits identified in genetic association studies and has demonstrated that rather than being randomly distributed throughout the brain, functionally-related gene sets accumulate in specific networks. The second contribution is the creation of a data structure that enables the interactive exploration of big brain network data with billions of edges.
By utilizing the resulting hierarchical and spatial organization of the data, this approach allows neuroscientists on-demand queries of incoming/outgoing connections of arbitrary regions of interest on different anatomical scales. These queries would otherwise exceed the limits of current consumer level PCs. The data structure is used in the third contribution, a novel web-based framework to explore neurobiological imaging and connectivity data of different types, modalities, and scale. It employs a query-based interaction scheme to retrieve 3D spatial gene expressions and various types of connectivity to enable an interactive dissection of networks in real-time with respect to their genetic composition. The data is related to a hierarchical organization of common anatomical atlases that enables neuroscientists to compare multimodal networks on different scales in their anatomical context. Furthermore, the framework is designed to facilitate collaborative work with shareable comprehensive workflows on the web. As a result, the approaches presented in this thesis may assist neuroscientists to refine their understanding of the functional organization of the brain beyond simple anatomical domains and expand their knowledge about how our genes affect our mind.
Interactive Freeform Architectural Design with Nearly Developables and Cold Bent Glass
(TU Wien, 2020-09-18) Gavriil, Konstantinos;
Interactive design of freeform architectural surface panelizations is at the core of this PhD thesis. We provide the computational framework for dealing with two important types of paneling elements. Specifically, we focus on certain types of developable surfaces and cold bent glass panels, all relevant to contemporary freeform architecture. To this end, we initially present a novel method for increasing the developability of a B-spline surface. We use the property that the Gauss image of a developable surface is 1-dimensional and can be locally well approximated by circles. This is cast into an algorithm for thinning the Gauss image by increasing the planarity of the Gauss images of appropriate neighborhoods. A variation of the main method allows us to tackle the problem of paneling a freeform architectural surface with developable panels, in particular enforcing rotational cylindrical, rotational conical and planar panels, which are the main preferred types of developable panels in architecture due to the reduced cost of manufacturing. We are interested in near developability, rather than exact developability, so the optimization approach is sufficient. The motivation behind this is the fact that most materials allow for a little bit of stretching and therefore developability need not be satisfied to a high degree. One such material is glass, which is the main focus of the second panelization problem of this thesis. Toughened glass can withstand higher stresses, and therefore allows initially planar glass panels to be elastically bent and fixed at ambient temperatures to a curved frame. This process is called cold bending and it produces panels that can exhibit double curvature, providing a cost- and energy-efficient alternative of higher optical quality than traditional hot bent glass panels.
However, it is very challenging to navigate the design space of cold bent glass panels due to the fragility of the material, which impedes the form-finding for practically feasible and aesthetically pleasing cold bent glass façades. We present an interactive, data-driven approach for designing cold bent glass façades that can be seamlessly integrated into a typical architectural design pipeline. Our method allows non-expert users to interactively edit a parametric surface while providing real-time feedback on the deformed shape and maximum stress of cold bent glass panels. Designs are automatically refined to minimize several fairness criteria while maximal stresses are kept within glass limits. We achieve interactive frame rates by using a differentiable mixture density network trained from more than a million simulations. Given a curved boundary, our regression model is capable of handling multistable configurations and accurately predicting the equilibrium shape of the panel and its corresponding maximal stress. We show predictions are highly accurate and validate our results with a physical realization of a cold bent glass surface. For both applications explored in this work, a plethora of results and examples are provided.
Interactive Visualization of Simulation Data for Geospatial Decision Support
(Technische Universität Wien, Vienna, Austria, 2020-02-12) Cornel, Daniel;
Floods are catastrophic events that claim thousands of human lives every year. For the prediction of these events, interactive decision support systems with integrated flood simulation have become a vital tool. Recent technological advances made it possible to simulate flooding scenarios of unprecedented scale and resolution, resulting in very large time-dependent data. The amount of simulation data is further amplified by the use of ensemble simulations to make predictions more robust, yielding high-dimensional and uncertain data far too large for manual exploration. New strategies are therefore needed to filter these data and to display only the most important information to support domain experts in their daily work. This includes the communication of results to decision makers, emergency services, stakeholders, and the general public. A modern decision support system has to be able to provide visual results that are useful for domain experts, but also comprehensible for larger audiences. Furthermore, for an efficient workflow, the entire process of simulation, analysis, and visualization has to happen in an interactive fashion, putting serious time constraints on the system. In this thesis, we present novel visualization techniques for time-dependent and uncertain flood, logistics, and pedestrian simulation data for an interactive decision support system. As the heterogeneous tasks in flood management require very diverse visualizations for different target audiences, we provide solutions to key tasks in the form of task-specific and user-specific visualizations. This allows the user to show or hide detailed information on demand to obtain comprehensible and aesthetic visualizations to support the task at hand. In order to identify the impact of flooding incidents on a building of interest, only a small subset of all available data is relevant, which is why we propose a solution to isolate this information from the massive simulation data. 
To communicate the inherent uncertainty of resulting predictions of damages and hazards, we introduce a consistent style for visualizing the uncertainty within the geospatial context. Instead of directly showing simulation data in a time-dependent manner, we propose the use of bidirectional flow maps with multiple components as a simplified representation of arbitrary material flows. For the communication of flood risks in a comprehensible way, however, the direct visualization of simulation data over time can be desired. Apart from the obvious challenges of the complex simulation data, the discrete nature of the data introduces additional problems for the realistic visualization of water surfaces, for which we propose robust solutions suitable for real-time applications. All of our findings have been acquired through a continuous collaboration with domain experts from several flood-related fields of work. The thorough evaluation of our work by these experts confirms the relevance and usefulness of our presented solutions.
Learning-based face reconstruction and editing
(Max Planck Institute for Informatics, 2020-09-29) Kim, Hyeongwoo;
Photo-realistic face editing – an important basis for a wide range of applications in movie and game productions, and applications for mobile devices – is based on computationally expensive algorithms that often require many tedious time-consuming manual steps. This thesis advances state-of-the-art face performance capture and editing pipelines by proposing machine learning-based algorithms for high-quality inverse face rendering in real time and highly realistic neural face rendering, and a video-based refocusing method for faces and general videos. In particular, the proposed contributions address fundamental open challenges towards real-time and highly realistic face editing. The first contribution addresses face reconstruction and introduces a deep convolutional inverse rendering framework that jointly estimates all facial rendering parameters from a single image in real time. The proposed method is based on a novel boosting process that iteratively updates the synthetic training data to better reflect the distribution of real-world images. Second, the thesis introduces a method for face video editing at previously unseen quality. It is based on a generative neural network with a novel space-time architecture, which enables photo-realistic re-animation of portrait videos using an input video. It is the first method to transfer the full 3D head position, head rotation, face expression, eye gaze and eye blinking from a source actor to a portrait video of a target actor. Third, the thesis contributes a new refocusing approach for faces and general videos in postprocessing. The proposed algorithm is based on a new depth-from-defocus algorithm that computes space-time-coherent depth maps, deblurred all-in-focus video and the focus distance for each frame. 
The high-quality results shown with various applications and challenging scenarios demonstrate the contributions presented in the thesis, and also show potential for machine learning-driven algorithms to solve various open problems in computer graphics.
Live Inverse Rendering
(SciDok - Der Wissenschaftsserver der Universität des Saarlandes, 2020-02-03) Meka, Abhimitra;
The field of computer graphics is being transformed by the process of ‘personalization’. The advent of augmented and mixed reality technology is challenging the existing graphics systems, which traditionally required elaborate hardware and skilled artistic efforts. Now, photorealistic graphics are required to be rendered on mobile devices with minimal sensors and compute power, and integrated with the real world environment automatically. Seamlessly integrating graphics into real environments requires the estimation of the fundamental light transport components of a scene - geometry, reflectance and illumination. While estimating environmental geometry and self-localization on mobile devices has progressed rapidly, the task of estimating scene reflectance and illumination from monocular images or videos in real-time (termed live inverse rendering) is still at a nascent stage. The challenge is that of designing efficient representations and models for these appearance parameters and solving the resulting high-dimensional, non-linear and under-constrained system of equations at frame rate. This thesis comprehensively explores, for the first time, various representations, formulations, algorithms and systems for addressing these challenges in monocular inverse rendering. Starting with simple assumptions on the light transport model – of Lambertian surface reflectance and single light bounce scenario – the thesis expands in various directions by including 3D geometry, multiple light bounces, non-Lambertian isotropic surface reflectance and data-driven reflectance representation to address various facets of this problem. In the first part, the thesis explores the design of fast parallel non-linear GPU optimization schemes for solving both sparse and dense sets of equations underlying the inverse rendering problem.
In the next part, it applies the current advances in machine learning methods to design novel formulations and loss-energies to give a significant push to the state-of-the-art of reflectance and illumination estimation. Several real-time applications of illumination-aware scene editing, including relighting and material-cloning, are also shown to be made possible for the first time by the new models proposed in this thesis. Finally, an outlook for future work on this problem is laid out, with particular emphasis on the interesting new opportunities afforded by the recent advances in machine learning.
Modeling Developable Surfaces with Discrete Orthogonal Geodesic Nets
(2020-02) Rabinovich, Michael;
Surfaces that are locally isometric to a plane are called developable surfaces. In the physical world, these surfaces can be formed by bending thin flat sheets of material, which makes them particularly attractive in manufacturing, architecture and art. Consequently, the design of freeform developable surfaces has been an active research topic in computer graphics, computer aided design, architectural geometry and computational origami for several decades. This thesis presents a discrete theory and a set of computational tools for modeling developable surfaces. The basis of our theory is a discrete model termed discrete orthogonal geodesic nets (DOGs). DOGs are regular quadrilateral meshes satisfying local angle constraints, extending the rich theory of nets in discrete differential geometry. Our model is simple, local, and, unlike previous works, it does not directly encode the surface rulings. Thus, DOGs can be used to model continuous deformations of developable surfaces independently of their decomposition into torsal and planar patches or the surface topology. We start by examining the “locking” phenomena common in computational models for developable surfaces, which was the primary motivation behind our work. We then follow up with the derivation and definitions behind our solution - DOGs, while theoretically and empirically demonstrating the connection between our model and its smooth counterpart and its resilience to the locking problem. We prove that every sampling of the smooth counterpart satisfies our constraints up to second order and establish connections between DOGs and other nets in discrete differential geometry. We then develop a theoretical and computational framework for deforming DOGs. We first derive a variety of geometric attributes on DOGs, including notions of normals, curvatures, and a novel DOG Laplacian operator. These can be used as objectives for various modeling tasks. 
By utilizing the regular nature of our model, our discrete quantities are simple yet precise, and we discuss their convergence. We then study the DOG constraints by looking at continuous deformations on DOGs. We characterize the shape space of DOGs for a given net connectivity. We show that generally, this space is locally a manifold of a fixed dimension, apart from a set of singularities, implying that DOGs are continuously deformable. Smooth flows can be constructed by a smooth choice of vectors on the manifold’s tangent spaces, selected to minimize a desired objective function under a given metric. The study of the shape space leads to a better understanding of the flexibility and rigidity of DOGs, and we devote an entire chapter to examining various notions of isometries on DOGs and a novel model termed “discrete orthogonal 4Q geodesic net”. We further show how to extend the shape space of DOGs by supporting creases and curved folds. We derive a discrete binary characterization for folds between discrete developable surfaces, accompanied by an algorithm to simultaneously fold creases and smoothly bend planar sheets. We complement our algorithm with essential building blocks for curved folding deformations: objectives to control dihedral angles and mountain-valley assignments. We apply our theory and resulting set of tools in the first interactive editing system for developable surfaces that supports arbitrary bending, stretching, cutting, (curved) folds, as well as smoothing and subdivision operations.
Multi-view image-based editing and rendering through deep learning and optimization
(2020-09-25) Philip, Julien;
Computer-generated imagery (CGI) takes a growing place in our everyday environment. Whether it is in video games or movies, CGI techniques are constantly improving in quality but also require ever more high-quality artistic content, which takes ever longer to create. With the emergence of virtual and augmented reality often comes the need to render or re-render assets that exist in our world. To allow widespread use of CGI in applications such as telepresence or virtual visits, the need for manual artistic replication of assets must be removed from the process. This can be done with the help of Image-Based Rendering (IBR) techniques that allow scenes or objects to be rendered in a free-viewpoint manner from a set of sparse input photographs. While this process requires little to no artistic work, it also does not allow for artistic control or editing of scene content. In this dissertation, we explore Multi-view Image Editing and Rendering. To allow casually captured scenes to be rendered with content alterations such as object removal, lighting editing, or scene compositing, we leverage optimization techniques and modern deep learning. We design our methods to take advantage of all the information present in multi-view content while handling specific constraints such as multi-view coherency. For object removal, we introduce a new plane-based multi-view inpainting algorithm. Planes are a simple yet effective way to fill geometry and they naturally enforce multi-view coherency as inpainting is computed in a shared rectified texture space, allowing us to correctly respect perspective. We demonstrate instance-based object removal at the scale of a street in scenes composed of several hundreds of images. We next address outdoor relighting with a learning-based algorithm that efficiently allows the illumination in a scene to be changed, while removing and synthesizing cast shadows for any given sun position and accounting for global illumination.
An approximate geometric proxy built using multi-view stereo is used to generate illumination and shadow related image buffers that guide a neural network. We train this network on a set of synthetic scenes allowing full supervision of the learning pipeline. Careful data augmentation allows our network to transfer to real scenes and provides state-of-the-art relighting results. We also demonstrate the capacity of this network to be used to compose real scenes captured under different lighting conditions and orientation. We then present contributions to image-based rendering quality. We discuss how our carefully designed depth-map meshing and simplification algorithm improves rendering performance and quality of a new learning-based IBR method. Finally, we present a method that combines relighting, IBR, and material analysis. To enable relightable IBR with accurate glossy effects, we extract both material appearance variations and qualitative texture information from multi-view content in the form of several IBR heuristics. We further combine them with path-traced irradiance images that specify the input and target lighting. This combination allows a neural network to be trained to implicitly extract material properties and produce realistic-looking relit viewpoints. Separating diffuse and specular supervision is crucial in obtaining high-quality output.
Perception-Aware Computational Fabrication: Increasing The Apparent Gamut of Digital Fabrication
(Università della Svizzera italiana, 2020-10-19) Piovarci, Michal;
Haptic and visual feedback are important for assessing objects' quality and affordance. One of the benefits of additive manufacturing is that it enables the creation of objects with personalized tactile and visual properties. This personalization is realized by the ability to deposit functionally graded materials at microscopic resolution. However, faithfully reproducing real-world objects on a 3D printer is a challenging endeavor. A large number of available materials and freedom in material deposition make exploring the space of printable objects difficult. Furthermore, current 3D printers can perfectly capture only a small number of objects from the real world, which makes high-quality reproductions challenging. Interestingly, similar to the manufacturing hardware, our senses of touch and sight have inborn limitations given by biological constraints. In this work, we propose that it is possible to leverage the limitations of human perception to increase the apparent gamut of a 3D printer by combining numerical optimization with perceptual insights. Instead of optimizing for exact replicas, we search for perceptually equivalent solutions. This not only simplifies the optimization but also achieves prints that better resemble the target behavior. To guide us towards the desired behavior, we design perceptual error metrics. Recovering such a metric requires conducting costly experiments. We tackle this problem by proposing a likelihood-based optimization that automatically recovers a metric that relates perception with physical properties. To minimize fabrication during the optimization, we map new designs into perception via numerical models. As with many complex design tasks, modeling the governing physics is either computationally expensive or we lack predictive models. We address this issue by applying perception-aware coarsening that focuses the computation towards perceptually relevant phenomena.
Additionally, we propose a data-driven fabrication-in-the-loop model that implicitly handles the fabrication constraints. We demonstrate the capabilities of our approach in the contexts of haptic and appearance reproduction. We show its applications to designing objects with prescribed compliance, and mimicking the haptics of drawing tools. Furthermore, we propose a system for manufacturing objects with spatially-varying gloss.
Preoperative Surgical Planning
(2020-04-29) Fauser, Johannes Ludwig;
For several decades, minimally invasive surgery has continuously improved both clinical workflow and outcome. Such procedures minimize patient trauma, shorten hospital stays, and reduce the risk of infection. Next-generation robot-assisted interventions promise to further improve on these advantages while at the same time opening the way to new surgical applications. Temporal Bone Surgery and Endovascular Aortic Repair are two examples of such currently researched approaches, where manual insertion of instruments, subject to a clinician's experience and daily performance, could be replaced by a robotic procedure. In the first, a flexible robot would drill a nonlinear canal through the mastoid, allowing a surgeon access to the temporal bone's apex, a target often unreachable without damaging critical risk structures. In the second, robotically driven guidewires could significantly reduce the fluoroscopy radiation to which patients and surgeons are exposed during navigation through the aorta. These robot-assisted surgeries require preoperative planning consisting of segmentation of risk structures and computation of nonlinear trajectories for the instruments. While surgeons could so far rely on preoperative images and a mental 3D model of the anatomy, these new procedures will make computational assistance inevitable due to the added complexity from image processing and motion planning. The automation of tiresome, labor-intensive manual tasks is therefore crucial for successful clinical implementation. This thesis addresses these issues and presents a preoperative pipeline based on CT images that automates segmentation and trajectory planning. Major contributions include an automatic shape-regularized segmentation approach for coherent anatomy extraction as well as an exhaustive trajectory planning step on locally optimized Bézier splines.
It also introduces thorough in silico experiments that perform functional evaluation on real and synthetically enlarged datasets. The benefits of the approach are shown on an in-house dataset of otobasis CT scans as well as on two publicly available datasets containing the aorta and heart.
Real-time 3D Hand Reconstruction in Challenging Scenes from a Single Color or Depth Camera
(2020) Müller, Franziska;
Hands are one of the main enabling factors for performing complex tasks and humans naturally use them for interactions with their environment. Reconstruction and digitization of 3D hand motion opens up many possibilities for important applications. Hand gestures can be directly used for human–computer interaction, which is especially relevant for controlling augmented or virtual reality (AR/VR) devices where immersion is of utmost importance. In addition, 3D hand motion capture is a precondition for automatic sign-language translation, activity recognition, or teaching robots. Different approaches for 3D hand motion capture have been actively researched in the past. While being accurate, gloves and markers are intrusive and uncomfortable to wear. Hence, markerless hand reconstruction based on cameras is desirable. Multi-camera setups provide rich input; however, they are hard to calibrate and lack the flexibility for mobile use cases. Thus, the majority of more recent methods use a single color or depth camera, which, however, makes the problem harder due to more ambiguities in the input. For interaction purposes, users need continuous control and immediate feedback. This means the algorithms have to run in real time and be robust in uncontrolled scenes. These requirements, achieving 3D hand reconstruction in real time from a single camera in general scenes, make the problem significantly more challenging. While recent research has shown promising results, current state-of-the-art methods still have strong limitations. Most approaches only track the motion of a single hand in isolation and do not take background clutter or interactions with arbitrary objects or the other hand into account. The few methods that can handle more general and natural scenarios run far from real time or use complex multi-camera setups. Such requirements make existing methods unusable for many of the aforementioned applications.
This thesis pushes the state of the art for real-time 3D hand tracking and reconstruction in general scenes from a single RGB or depth camera. The presented approaches explore novel combinations of generative hand models, which have been used successfully in the computer vision and graphics community for decades, and powerful cutting-edge machine learning techniques, which have recently emerged with the advent of deep learning. In particular, this thesis proposes a novel method for hand tracking in the presence of strong occlusions and clutter, the first method for full global 3D hand tracking from in-the-wild RGB video, and a method for simultaneous pose and dense shape reconstruction of two interacting hands that, for the first time, combines a set of desirable properties previously unseen in the literature.
Real-time 3D Human Body Pose Estimation from Monocular RGB Input
(Saarländische Universitäts-und Landesbibliothek, 2020-10) Mehta, Dushyant;
Human motion capture finds extensive application in movies, games, sports and biomechanical analysis. However, existing motion capture solutions require cumbersome external and/or on-body instrumentation, or use active sensors with limits on the possible capture volume dictated by power consumption. The ubiquity and ease of deployment of RGB cameras make monocular RGB-based human motion capture an extremely useful problem to solve, which would lower the barrier to entry for content creators to employ motion capture tools, and enable newer applications of human motion capture. This thesis demonstrates the first real-time monocular RGB-based motion capture solutions that work in general scene settings. They are based on developing neural-network-based approaches to address the ill-posed problem of estimating 3D human pose from a single RGB image, in combination with model-based fitting. In particular, the contributions of this work make advances towards three key aspects of real-time monocular RGB-based motion capture, namely speed, accuracy, and the ability to work for general scenes. New training datasets are proposed, for single-person and multi-person scenarios, which, together with the proposed transfer-learning-based training pipeline, allow learning-based approaches to be appearance invariant. The training datasets are accompanied by evaluation benchmarks with multiple avenues of fine-grained evaluation. The evaluation benchmarks differ visually from the training datasets, so as to promote efforts towards solutions that generalize to in-the-wild scenes. The proposed task formulations for the single-person and multi-person case allow higher accuracy, and incorporate additional qualities, such as occlusion robustness, that are helpful in the context of a full motion capture solution.
The multi-person formulations are designed to have a nearly constant inference time regardless of the number of subjects in the scene, and combined with contributions towards fast neural network inference, enable real-time 3D pose estimation for multiple subjects. Combining the proposed learning-based approaches with a model-based kinematic skeleton fitting step provides temporally stable joint angle estimates, which can be readily employed for driving virtual characters.
Reconstructing 3D Human Avatars from Monocular Images
(2020) Alldieck, Thiemo;
Modeling 3D virtual humans has been an active field of research over the last decades. It plays a fundamental role in many applications, such as movie production, sports and medical sciences, or human-computer interaction. Early works focus on artist-driven modeling or utilize expensive scanning equipment. In contrast, our goal is the fully automatic acquisition of personalized avatars using low-cost monocular video cameras only. In this dissertation, we show fundamental advances in 3D human reconstruction from monocular images. We solve this challenging task by developing methods that effectively fuse information from multiple points in time and realistically complete reconstructions from sparse observations. Given a video or only a single photograph of a person in motion, we reconstruct, for the first time, not only his or her 3D pose but the full 3D shape including the face, hair, and clothing. We explore various approaches to monocular image and video-based 3D human reconstruction and demonstrate both straightforward and sophisticated reconstruction methods focused on accuracy, simplicity, usability, and visual fidelity. Through extensive evaluations, we give insights into important parameters, reconstruction quality, and the robustness of the methods. For the first time, our methods enable camera-based, easy-to-use self-digitization for exciting new applications such as telepresence or virtual try-on for online fashion shopping.
Sensor Applications for Human Activity Recognition in Smart Environments
(2020-11-17) Fu, Biying;
Human activity recognition (HAR) is the automated recognition of individual or group activities from sensor inputs. It serves a wide range of application areas, such as health care, assistive technologies, quantified-self, and safety applications. HAR is the key to building human-centred applications and enables users to seamlessly and naturally interact with each other or with a smart environment. A smart environment is an instrumented room or space equipped with sensors and actuators to perceive the physical state or human activities within this space. The diversity of sensors makes it difficult to choose the appropriate sensor for a specific application. This work aims at presenting sensor-driven applications for human activity recognition in smart environments by using novel sensing categories beyond the existing sensor technologies commonly applied to these tasks. The intention is to improve the interaction for various sub-fields of human activities. Each application addresses the difficulties following the typical process pipeline for designing a smart environment application. First, I survey the most prominent research works, with a focus on sensor-driven categorization in the research domain of HAR, to identify possible research gaps and position my work. I identify two use cases: quantified-self and smart home applications. Quantified-self aims at self-tracking and self-knowledge through numbers. Common sensor technology for daily tracking of aerobic endurance training activities, such as walking, running, or cycling, is based on acceleration data from wearables. However, more stationary exercises, such as strength-based training or stretching, are also important for a healthy lifestyle, as they improve body coordination and balance. These exercises are not well tracked by wearing only a single wearable sensor, as these activities rely on coordinated movement of the entire body.
I leverage two sensing categories to design two portable mobile applications for remote sensing of these more stationary physical workout exercises. Sensor-driven applications for the smart home domain aim at building systems that make the life of the occupants safer and more convenient. In this thesis, I target stationary applications to be integrated into the environment to allow a more natural interaction between the occupant and the smart environment. I propose two possible solutions to achieve this task. The first is a surface-acoustic-based system that provides a sparse sensor setup to detect a basic set of activities of daily living, including an investigation of a minimalist sensor arrangement. The second application is a tag-free indoor positioning system. Indoor localization aims at providing location information to build intelligent services for smart homes. Accurate indoor positioning offers the basic context from which high-level reasoning systems can derive more complex contexts. The floor-based localization system using electrostatic sensors is scalable to different room geometries due to its layout and modular composition. Finally, privacy with non-visual input is the main aspect for applications proposed in this thesis. In addition, this thesis addresses the issue of adaptivity from prototypes towards real-world applications. I identify the issues of data sparsity in the training data and data diversity in the real-world data. To address data sparsity, I demonstrate a data augmentation strategy for time series that increases the amount of training data by generating synthetic data. To mitigate the inherent difference between the development dataset and real-world scenarios, I further investigate several approaches, including metric-based learning and fine-tuning. I explore these methods to fine-tune the trained model on a limited amount of individual data, with and without retraining the pre-trained inference model.
Finally, some examples show how to deploy the offline model to an online processing device with limited hardware resources.
The Smart Point Cloud: Structuring 3D intelligent point data
(ORBi, 2019-06-05) Poux, Florent;
Discrete spatial datasets known as point clouds often lay the groundwork for decision-making applications. For example, we can use such data as a reference for autonomous car and robot navigation, as a layer for floor plan creation and building construction, or as a digital asset for environment modelling and incident prediction. Applications are numerous, and potentially increasing if we consider point clouds as digital reality assets. Yet, this expansion faces technical limitations, mainly from the lack of semantic information within point ensembles. Connecting knowledge sources is still a very manual and time-consuming process suffering from error-prone human interpretation. This highlights a strong need for domain-related data analysis to create coherent and structured information. The thesis tries to solve automation problems in point cloud processing to create intelligent environments, i.e. virtual copies that can be used or integrated in fully autonomous reasoning services. We tackle point cloud questions associated with knowledge extraction – particularly segmentation and classification – structuration, visualisation and interaction with cognitive decision systems. We propose to connect both point cloud properties and formalized knowledge to rapidly extract pertinent information using domain-centered graphs. The dissertation delivers the concept of a Smart Point Cloud (SPC) Infrastructure, which serves as an interoperable and modular architecture for unified processing. It permits easy integration into existing workflows and multi-domain specialization through device knowledge, analytic knowledge or domain knowledge. Concepts, algorithms, code and materials are given to replicate findings and extend current applications.
Visual Analytics for Cooperative and Competitive Behavior in Team Sports
(2020-03) Stein, Manuel Dr.;
Automatic and interactive data analysis is instrumental in making use of increasing amounts of complex data. Owing to novel sensor modalities, analysis of data generated in professional team sport leagues such as soccer, handball, and basketball has recently become of concern, with high commercial and research interest. The analysis of team sports can serve many goals, for example, in coaching to understand the effects of strategies and tactics or to derive insights for improving performance. Also, it is often decisive for coaches and analysts to understand why a certain movement of a player or groups of players happened, and what the respective influencing factors were. We consider team sports as group movement including cooperation and competition of individuals following a specific set of rules. Analyzing team sports is a challenging problem as it involves joint understanding of heterogeneous data perspectives, including high-dimensional, video, and collective movement data, as well as considering team behavior and rules (constraints) given in the particular team sport. However, the discipline is in its infancy, largely restricted to commercial solutions developed out of necessity, while neglecting the movement context, with only a few academic contributions so far, and much room for improvement still exists. Consequently, the research in this dissertation happens at the intersection of several cutting-edge technologies, including computer vision and machine learning, data visualization, and human-computer interaction. All required research steps from data extraction and context enrichment to the visualization of cooperative and competitive behavior are covered in this thesis, enabling data acquisition and match analysis directly from existing video sources. 
The methods are capable of providing accurate analysis results both from a recording and in real time during a live match, improving and advancing the analytical possibilities of coaches and analysts in various invasive team sports. The impact of the presented methods is illustrated by highlighting how their application by the Austrian first league soccer club TSV Hartberg greatly improved the club's analysis process. Building on the foundations set by this dissertation will help to further revolutionize the way match analysis is performed in the upcoming years. Ultimately, the progress enabled by research methods such as the introduced in-video visualization will not be limited to the domain of team sports analysis alone, but will have a general impact on how we visualize, see and perceive our data in the future.
Volumetric Subdivision for Efficient Integrated Modeling and Simulation
(2020-11-05) Altenhofen, Christian;
Continuous surface representations, such as B-spline and Non-Uniform Rational B-spline (NURBS) surfaces, are the de facto standard for modeling 3D objects - thin shells and solid objects alike - in the field of Computer-Aided Design (CAD). For performing physically based simulation, Finite Element Analysis (FEA) has been the industry standard for many years. In order to analyze physical properties such as stability, aerodynamics, or heat dissipation, the continuous models are discretized into finite element (FE) meshes. A tight integration of and a smooth transition between geometric design and physically based simulation are key factors for an efficient design and engineering workflow. Converting a CAD model from its continuous boundary representation (B-Rep) into a discrete volumetric representation for simulation is a time-consuming process that introduces approximation errors and often requires manual interaction by the engineer. Deriving design changes directly from the simulation results is especially difficult as the meshing process is irreversible. Isogeometric Analysis (IGA) tries to overcome this meshing hurdle by using the same representation for describing the geometry and for performing the simulation. Most commonly, IGA is performed on bivariate and trivariate spline representations (B-spline or NURBS surfaces and volumes). While existing CAD B-Rep models can be used directly for simulating thin-shell objects, simulating solid objects requires a conversion from spline surfaces to spline volumes. As spline volumes need a trivariate tensor-product topology, complex 3D objects must be represented via trimming or by connecting multiple spline volumes, limiting the continuity to C^0. As an alternative to NURBS or B-splines, subdivision models allow for representing complex topologies as a single entity, removing the need for trimming or tiling and potentially providing higher continuity.
While subdivision surfaces have shown promising results for designing and simulating shells, IGA on subdivision volumes remained mostly unexplored apart from the work of Burkhart et al. In this dissertation, I investigate how volumetric subdivision representations are beneficial for a tighter integration of geometric modeling and physically based simulation. Focusing on Catmull-Clark (CC) solids, I present novel techniques in the areas of efficient limit evaluation, volumetric modeling, numerical integration, and mesh quality analysis. I present an efficient link to FEA, as well as my IGA approach on CC solids that improves upon Burkhart et al.'s proof of concept with constant-time limit evaluation, more accurate integration, and higher mesh quality. Efficient limit evaluation is a key requirement when working with subdivision models in geometric design, visualization, simulation, and 3D printing. In this dissertation, I present the first method for constant-time volumetric limit evaluation of CC solids. It is faster than the subdivision-based approach by Burkhart et al. for every topological constellation and parameter point that would require more than two local subdivision steps. Adapting the concepts of well-known surface modeling tools, I present a volumetric modeling environment for CC-solid control meshes. Consistent volumetric modeling operations built from a set of novel volumetric Euler operators allow for creating and modifying topologically consistent volumetric meshes. Furthermore, I show how to manipulate groups of control points via parameters, how to avoid intersections with inner control points while modeling the outer surface, and how to use CC solids in the context of multi-material additive manufacturing. For coupling of volumetric subdivision models with established FE frameworks, I present an efficient and consistent tetrahedral mesh generation technique for CC solids. 
The technique exploits the inherent volumetric structure of CC-solid models and is at least 26 times faster than the tetrahedral meshing algorithm provided by CGAL. This allows the tetrahedral mesh to be re-created or updated almost instantly when the CC-solid model changes. However, the mesh quality strongly depends on the quality of the control mesh. In the context of structural analysis, I present my IGA approach on CC solids. The IGA approach yields converging simulation results for models with fewer elements and fewer degrees of freedom than FE simulations on tetrahedral meshes with linear and higher-order basis functions. The solver also requires fewer iterations to solve the linear system due to the higher continuity throughout the simulation model provided by the subdivision basis functions. Extending Burkhart et al.'s method, my hierarchical quadrature scheme for irregular CC-solid cells increases the accuracy of the integrals for computing surface areas and element stiffnesses. Furthermore, I introduce a quality metric that quantifies the parametrization quality of the limit volume, revealing distortions, inversions, and singularities. The metric shows that cells with multiple adjacent boundary faces induce singularities in the limit, even for geometrically well-shaped control meshes. Finally, I present a set of topological operations for splitting such boundary cells, resolving the singularities. These improvements further reduce the number of elements required to obtain converging results as well as the time required for solving the linear system.
