2019

Optimal Spatial Registration of SLAM for Augmented Reality

Wientapper, Folker

Model Reduction for Interactive Geometry Processing

Brandt, Christopher

Parametric Procedural Models for 3D Object Retrieval, Classification and Parameterization

Getto, Roman

Computational Design of Auxetic Shells

Konakovic Lukovic, Mina

Artificial Intelligence for Efficient Image-based View Synthesis

Leimkühler, Thomas

Soft Segmentation of Images

Aksoy, Yagiz

Model-based human performance capture in outdoor scenes

Robertini, Nadia

Advances on computational imaging, material appearance, and virtual reality

Serrano, Ana

Practical Measurement-based Modeling and Rendering of Surface Diffraction

Toisoul, Antoine

Efficient Light-Transport Simulation Using Machine Learning

Müller, Thomas

Neighborhood Data Structures, Manifold Properties, and Processing of Point Set Surfaces

Skrodzki, Martin Dr.

Locally Solving Linear Systems for Geometry Processing

Herholz, Philipp

Viewpoint-Free Photography for Virtual Reality

Hedman, Peter

3D scene analysis through non-visual cues

Monszpart, Aron

Generating High-quality 3D Assets from Easy-to-access 2D content

Wang, Yangtuanfeng

GPU Data Structures and Code Generation for Modeling, Simulation, and Visualization

Mueller-Roemer, Johannes Sebastian

Quad Meshes as Optimized Architectural Freeform Structures

Pellis, Davide

Lightweight material acquisition using deep learning

Deschaintre, Valentin

Revealing the Invisible: On the Extraction of Latent Information from Generalized Image Data

Iseringhausen, Julian Dr.


Recent Submissions

Now showing 1 - 19 of 19
  • Item
    Optimal Spatial Registration of SLAM for Augmented Reality
    (2019-03-15) Wientapper, Folker
    Augmented reality (AR) is a paradigm that aims at fusing the perceived real environment of a human with digital information located in 3D space. Typically, virtual 3D graphics are overlaid onto the captured images of a moving camera or directly into the user's field of view by means of optical see-through displays (OST). For a correct perspective and view-dependent alignment of the visualization, various static and dynamic geometric registration problems must be solved in order to create the impression that the virtual and the real world are seamlessly interconnected. The advances during the last decade in the field of simultaneous localization and mapping (SLAM) represent an important contribution to this general problem. It is now possible to reconstruct the real environment and to simultaneously capture the dynamic movements of a camera from the images without having to instrument the environment in advance. However, SLAM can only partly solve the entire registration problem, because the retrieved 3D scene geometry and the calculated motion path are spatially related only to an arbitrarily selected coordinate system. Without a proper reconciliation of coordinate systems (spatial registration), the real world of the human observer remains decoupled from the virtual world. Existing approaches for solving this problem either require a virtual 3D model that represents a real object with sufficient accuracy (model-based tracking), or they rely on use-case-specific assumptions and additional sensor data (such as GPS signals or the Manhattan-world assumption). These approaches are therefore bound to additional prerequisites, which limit their general applicability.
The circumstance that automated registration is desirable but not always possible creates the need for techniques that allow a user to specify connections between the real and the virtual world when setting up AR applications, so that the registration process can be supported and controlled. These techniques must be complemented with numerical algorithms that optimally exploit the provided information to obtain precise registration results. Within this context, the present thesis provides the following contributions.
    • We propose a novel, closed-form (non-iterative) algorithm for calculating a Euclidean or a similarity transformation. The presented algorithm generalizes recent state-of-the-art solvers for computing the camera pose from 2D measurement points in the image (the perspective-n-point problem), a fundamental problem in computer vision that has attracted research for many decades. The generalization extends and unifies these algorithms so that they can handle other types of input correspondences than originally designed for. With this algorithm, it becomes possible to perform a rigid registration of SLAM systems to a target coordinate system based on heterogeneous and partially indeterminate input data.
    • We address the global refinement of structure and motion parameters by means of iterative sparse minimization (bundle adjustment, or BA), which has become a standard technique inside SLAM systems. We propose a variant of BA in which information about the virtual domain is integrated as constraints by means of an optimization-on-manifold approach. This compensates for low-frequency deformations (non-rigid registration) of the estimated camera path and the reconstructed scene geometry caused by measurement-error accumulation and the ill-conditionedness of the BA problem.
    • We present two approaches in which users can contribute their knowledge for registering a SLAM system. In the first variant, the user places markers in the real environment with predefined connections to the virtual coordinate system. Precise positioning of the markers is not required; rather, they can be placed arbitrarily on surfaces or along edges, which notably reduces the preparative effort. At run-time, the dispersed information is collected and registration is accomplished automatically. In the second variant, the user marks salient points in an image sequence during a preparative preprocessing step and assigns corresponding points in the virtual 3D space via a simple point-and-click metaphor. The result of this preparative phase is a precisely registered and ready-to-use reference model for camera tracking at run-time.
    • Finally, we propose an approach for the geometric calibration of optical see-through displays. We present a parametric model that dynamically adapts the rendering of virtual 3D content to the current viewpoint of the human observer, including a pre-correction of image aberrations caused by the optics or irregularly curved combiners. To retrieve its parameters, we propose a camera-based approach in which elements of the real and the virtual domain are observed simultaneously. The calibration procedure was developed for a head-up display in a vehicle; a prototypical extension to head-mounted displays is also presented.
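The rigid-registration contribution builds on closed-form alignment from point correspondences. As a minimal, hedged sketch of this family of solvers, the classic Kabsch/Umeyama solution for 3D-3D correspondences (not the thesis's generalized solver for heterogeneous correspondence types) can be written as:

```python
import numpy as np

def rigid_register(src, dst):
    """Closed-form least-squares rigid alignment (Kabsch/Umeyama).
    src, dst: (N, 3) arrays of corresponding 3D points.
    Returns rotation R and translation t such that dst ~ src @ R.T + t."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)            # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # correct an improper rotation (reflection) if one appears
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_d - R @ mu_s
    return R, t
```

In a SLAM registration setting, `src` would be points in the arbitrary SLAM frame and `dst` the corresponding points in the target (virtual) coordinate system.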
  • Item
    Model Reduction for Interactive Geometry Processing
    (n/a, 2019-04-01) Brandt, Christopher
    The research field of geometry processing is concerned with the representation, analysis, modeling, simulation and optimization of geometric data. In this thesis, we introduce novel techniques and efficient algorithms for problems in geometry processing, such as the modeling and simulation of elastic deformable objects, the design of tangential vector fields or the automatic generation of spline curves. The complexity of the geometric data determines the computation time of algorithms within these applications. The high resolution of modern meshes, for example, poses a big challenge when geometry processing tools are expected to perform at interactive rates. The goal of this thesis is therefore to introduce fast approximation techniques for problems in geometry processing. One line of research toward this goal is to introduce novel model order reduction techniques for problems in geometry processing. Model order reduction is a concept for reducing the computational complexity of models in numerical simulations, energy optimizations and modeling problems. New specialized model order reduction approaches are introduced, and existing techniques are applied to enhance tools within the field of geometry processing. In addition to introducing model reduction techniques, we make several other contributions to the field. We present novel discrete differential operators and higher-order smoothness energies for the modeling of tangential n-vector fields. These are used to develop novel tools for the modeling of fur, stroke-based renderings, and anisotropic reflection properties on meshes. We propose a geometric flow for curves in shape space that allows for the processing and creation of animations of elastic deformable objects. A new optimization scheme for sparsity-regularized functionals is introduced and used to compute natural, localized deformations of geometric objects. Lastly, we reformulate the classical problem of spline optimization as a sparsity-regularized optimization problem.
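The core idea of model order reduction can be illustrated with a generic Galerkin projection: a large linear system is restricted to a low-dimensional subspace, solved there cheaply, and lifted back. This is a sketch of the general principle under simple assumptions (a symmetric positive-definite operator and a precomputed basis), not the thesis's specific reduced models:

```python
import numpy as np

def reduced_solve(K, f, U):
    """Galerkin-projected solve of K x = f on the subspace spanned by the
    columns of the basis U (n x r, r << n). The r x r reduced operator can
    be precomputed once and reused for interactive-rate solves."""
    K_r = U.T @ K @ U            # small reduced operator
    f_r = U.T @ f                # reduced right-hand side
    q = np.linalg.solve(K_r, f_r)
    return U @ q                 # lift the reduced solution back to full space
```

If the true solution lies in the span of the basis, the reduced solve reproduces it exactly; otherwise it returns the best approximation the subspace allows, at a fraction of the full solve's cost.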
  • Item
    Parametric Procedural Models for 3D Object Retrieval, Classification and Parameterization
    (2019-04-05) Getto, Roman
    The amount of 3D objects has grown over the last decades, and we can expect it to grow much further in the future. 3D objects are also becoming more and more accessible to non-expert users. The growing amount of available 3D data is welcome for everyone working with this type of data, as the creation and acquisition of many 3D objects is still costly. However, the vast majority of available 3D objects are present only as pure polygon meshes. We cannot assume that meta-data and additional semantics are delivered together with 3D objects created by non-experts or produced by automatic systems from 3D scans of real objects. For this reason, content-based retrieval and classification techniques for 3D objects have been developed. Many systems address the completely unsupervised case. However, previous work has shown that the performance of these tasks can be increased considerably by exploiting any type of prior knowledge. In this thesis I use procedural models as prior knowledge. Procedural models describe the construction process of a 3D object instead of explicitly describing the components of the surface. These models can include parameters in the construction process to generate variations of the resulting 3D object. Procedural representations are present in many domains, as these implicit representations are vastly superior to explicit representations in terms of content generation, flexibility and reusability. Therefore, using a procedural representation always has the potential of outclassing other approaches in many aspects. The usage of procedural models in 3D object retrieval and classification is not well researched, as this powerful representation can be arbitrarily complex to create and handle. In the 3D object domain, procedural models are mostly used for highly regularized structures like buildings and trees.
However, procedural models can greatly improve 3D object retrieval and classification, as this representation offers a persistent and reusable full description of a type of object. This description can be used for queries and class definitions without any additional data. Furthermore, the initial classification can be improved by using a procedural model: a procedural model allows an unknown object to be completely parameterized, and further characteristics of different class members to be identified. The only drawback is that the manual design and creation of specialized procedural models is itself very costly. In this thesis I concentrate on the generalization and automation of procedural models for application in 3D object retrieval and 3D object classification. For the generalization and automation of procedural models I propose different levels of interaction for a user, to fulfill the possible needs of control and automation. This thesis presents new approaches for different levels of automation: the automatic generation of procedural models from a single exemplary 3D object, the semi-automatic creation of a procedural model with a sketch-based modeling tool, and the manual definition of a procedural model with a restricted variation space. The second important step is the insertion of parameters into the procedural model, to define the variations of the resulting 3D object. For this step I also propose several possibilities for the optimal level of control and automation: an automatic parameter-detection technique, a semi-automatic deformation-based insertion, and an interface for manually inserting parameters by choosing one of the offered insertion principles. It is also possible to manually insert parameters into the procedures if the user needs full control at the lowest level.
To enable the usage of procedural models directly for 3D object retrieval and classification, I propose descriptor-based and deep-learning-based approaches. Descriptors measure the difference between 3D objects. By using descriptors as a comparison algorithm, we can define the distance between procedural models and other objects and order these by similarity. The procedural models are sampled and compared to retrieve an optimal object-retrieval list. We can also directly use procedural models as the data basis for retraining a convolutional neural network. By training on a set of procedural models, we can directly classify new unknown objects without any further large learning database. Additionally, I propose a new multi-layered parameter-estimation approach using three different comparison measures to parameterize an unknown object. Hence, an unknown object is not only classified with a procedural model; the approach is also able to gather new information about the characteristics of the object by using the procedural model for its parameterization. As a result, the combination of procedural models with the tasks of 3D object retrieval and classification leads to a meta-concept of a holistically seamless system for defining, generating, comparing, identifying, retrieving, recombining, editing and reusing 3D objects.
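Descriptor-based ranking can be sketched with a classic shape descriptor. The following uses the D2 shape distribution (a histogram of distances between random surface-point pairs, due to Osada et al.) as a hedged stand-in for the descriptors actually used in the thesis; sampled procedural-model instances and a query object are compared by L1 distance between their histograms:

```python
import numpy as np

def d2_descriptor(points, bins=32, r_max=2.5, n_pairs=2000, seed=0):
    """D2 shape distribution: normalized histogram of distances between
    random point pairs. The fixed seed keeps descriptors comparable."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(points), n_pairs)
    j = rng.integers(0, len(points), n_pairs)
    d = np.linalg.norm(points[i] - points[j], axis=1)
    h, _ = np.histogram(d, bins=bins, range=(0.0, r_max), density=True)
    return h

def rank_by_similarity(query, database):
    """Order database point clouds by L1 descriptor distance to the query."""
    dq = d2_descriptor(query)
    dists = [np.abs(dq - d2_descriptor(p)).sum() for p in database]
    return np.argsort(dists)
```

In a procedural-model setting, each database entry would be a point cloud sampled from one instantiation of a procedural model, so a whole model family can be matched against a single query mesh.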
  • Item
    Computational Design of Auxetic Shells
    (2019-07-18) Konakovic Lukovic, Mina
    Recent advances in material science and digital fabrication provide promising opportunities for product design, mechanical and biomedical engineering, robotics, architecture, art, and science. Engineered materials and personalized fabrication are revolutionizing manufacturing culture and having a significant impact on various scientific and industrial works. As new fabrication technologies emerge, effective computational tools are needed to fully exploit the potential of digital fabrication. This thesis introduces a novel computational method for design and fabrication with auxetic materials. The term auxetic refers to solid materials with a negative Poisson's ratio: when the material is stretched in one direction, it also expands in all other directions. In particular, we study 2D auxetic materials in the form of a triangular linkage which exhibits auxetic behavior at the macro scale. This stretching, in turn, allows the flat material to approximate doubly-curved surfaces, making it attractive for fabrication. We physically realize auxetic materials by introducing a specific pattern of cuts into an approximately inextensible material such as sheet metal, plastic, or leather. On a larger scale, we use individual rigid triangular elements and connect them with joints. First, this thesis focuses on a regular triangular linkage. When deformed into a curved shape, the linkage yields spatially-varying hexagonal openings. However, the global coupling of the linkage elements makes a manual, incremental approach unlikely to succeed when trying to approximate a given curved surface. Thus, we leverage conformal geometry to enable complex surface design. In particular, we compute a global conformal map with bounded scale factor to initialize an otherwise intractable non-linear optimization. Constraint-based optimization is then used to find the final linkage configuration that closely approximates a target 3D surface.
Furthermore, we develop a computational method for designing novel deployable structures via programmable auxetics, i.e., a spatially varying triangular linkage optimized to directly and uniquely encode the target 3D surface in the 2D pattern. The target surface is rapidly deployed from a flat initial state via inflation or gravitational loading. The thesis presents both inverse and forward design tools for interactive surface design with programmable auxetics. This allows the user to efficiently approximate a given shape and to directly edit and adapt the auxetic linkage structure to explore design alternatives. In addition, our solution enables simulation-based form-finding that uses deployment forces for interactive exploration of feasible shapes. The resulting designs can be easily manufactured via digital fabrication technologies such as laser cutting, CNC milling, or 3D printing. Auxetic materials and deployable structures enable scientific, industrial, and consumer applications across a wide variety of scales and usages. We validate our computational methods through a series of physical prototypes and application case studies, ranging from surgical implants, through art pieces, to large-scale architectural structures.
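The constraint-based optimization over rigid linkage elements can be illustrated with the simplest ingredient such solvers share: iteratively projecting vertex positions onto edge-length constraints. This is a generic position-based projection sketch, not the thesis's actual solver or linkage parameterization:

```python
import numpy as np

def project_edge_lengths(P, edges, rest, iters=100):
    """Gauss-Seidel projection onto edge-length constraints: each pass moves
    both endpoints of every edge symmetrically until the edge regains its
    rest length. P: (n, d) positions; edges: index pairs; rest: lengths."""
    P = P.copy()
    for _ in range(iters):
        for (a, b), l0 in zip(edges, rest):
            d = P[b] - P[a]
            ln = np.linalg.norm(d)
            if ln < 1e-12:
                continue
            corr = 0.5 * (ln - l0) / ln * d   # split the correction evenly
            P[a] += corr
            P[b] -= corr
    return P
```

Real linkage solvers add many more constraint types (joint coupling, surface proximity) and a global step, but the local projection above is the recurring building block.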
  • Item
    Artificial Intelligence for Efficient Image-based View Synthesis
    (2019-06-24) Leimkühler, Thomas
    Synthesizing novel views from image data is a widely investigated topic in both computer graphics and computer vision, and has many applications such as stereo and multi-view rendering for virtual reality, light field reconstruction, and image post-processing. While image-based approaches have the advantage of reduced computational load compared to classical model-based rendering, efficiency is still a major concern. This thesis demonstrates how concepts and tools from artificial intelligence can be used to increase the efficiency of image-based view synthesis algorithms. In particular, it is shown how machine learning can help to generate point patterns useful for a variety of computer graphics tasks, how path planning can guide image warping, how sparsity-enforcing optimization can lead to significant speedups in interactive distribution-effect rendering, and how probabilistic inference can be used to perform real-time 2D-to-3D conversion.
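At the heart of image-based view synthesis is image warping: reprojecting source pixels according to per-pixel disparity. The following is a deliberately minimal 1-D, nearest-neighbor backward warp to show the principle; real systems warp 2-D images and handle occlusions and holes, which is where the guidance techniques discussed in the thesis come in:

```python
import numpy as np

def backward_warp(image, disparity, baseline):
    """Synthesize a novel view: each target pixel x samples the source at
    x + baseline * disparity(x) (nearest-neighbor, clamped at the borders)."""
    w = len(image)
    x = np.arange(w)
    src = np.clip(np.rint(x + baseline * disparity).astype(int), 0, w - 1)
    return image[src]
```

With `baseline = 0` the input is reproduced; increasing the baseline slides content according to its disparity, which is exactly the stereo / multi-view rendering setting mentioned above.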
  • Item
    Soft Segmentation of Images
    (ETH Zurich, 2019) Aksoy, Yagiz
    Realistic editing of photographs requires careful treatment of color mixtures that commonly occur in natural scenes. These color mixtures are typically modeled using soft selection of objects or scene colors. Hence, accurate representation of these soft transitions between image regions is essential for high-quality image editing and compositing. Current techniques for generating such representations depend heavily on interaction by a skilled visual artist, as creating such accurate object selections is a tedious task. In this thesis, we approach the soft segmentation problem from two complementary properties of a photograph. Our first focus is representing images as a mixture of main colors in the scene, by estimating soft segments of homogeneous colors. We present a robust per-pixel nonlinear optimization formulation while simultaneously targeting computational efficiency and high accuracy. We then turn our attention to semantics in a photograph and present our work on soft segmentation of particular objects in a given scene. This work features graph-based formulations that specifically target the accurate representation of soft transitions in linear systems. Each part first presents an interactive segmentation scheme that targets applications popular in professional compositing and movie post-production. The interactive formulations are then generalized to the automatic estimation of generic image representations that can be used to perform a number of otherwise complex image editing tasks effortlessly. The first problem studied is green-screen keying, interactive estimation of a clean foreground layer with accurate opacities in a studio setup with a controlled background, typically set to be green. We present a simple two-step interaction scheme to determine the main scene colors and their locations. 
The soft segmentation of the foreground layer is done via the novel color unmixing formulation, which can effectively represent a pixel color as a mixture of many colors characterized by statistical distributions. We show that our formulation is robust against many challenges in green-screen keying and can be used to achieve production-quality keying results in a fraction of the time required by commercial software. We then study soft color segmentation, the estimation of layers with homogeneous colors and corresponding opacities. The soft color segments can be overlaid to give the original image, providing an effective intermediate representation of an image. We decompose the global energy optimization formulation that typically models the soft color segmentation task into three sub-problems that can be implemented with computational efficiency and scalability. Our formulation gets its strength from the color unmixing energy, which is essential in ensuring homogeneous layer colors and accurate opacities. We show that our method achieves a segmentation quality that allows realistic manipulation of colors in natural photographs. Natural image matting is the generalized version of green-screen keying, where an accurate estimation of foreground opacities is targeted in an unconstrained setting. We approach this problem using a graph-based approach, where we model the connections in the graph as forms of information flow that distribute the information from the user input into the whole image. By carefully defining information flows to target challenging regions in complex foreground structures, we show that high-quality soft segmentation of objects can be estimated through a closed-form solution of a linear system. We extend our approach to related problems in natural image matting such as matte refinement and layer color estimation, and demonstrate the effectiveness of our formulation through quantitative, qualitative and theoretical analysis.
Finally, we introduce semantic soft segments, a set of layers that correspond to semantically meaningful regions in an image with accurate soft transitions between different objects. We approach this problem from a spectral segmentation angle and propose a graph structure that embeds texture and color features from the image as well as higher-level semantic information generated by a neural network. The soft segments are generated via eigendecomposition of the carefully constructed Laplacian matrix fully automatically. We demonstrate that compositing and targeted image editing tasks can be done with little effort using semantic soft segments.
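The spectral step behind semantic soft segments can be sketched in a few lines: build an affinity matrix, form the normalized graph Laplacian, and read soft indicator-like vectors off its smallest eigenvectors. This generic sketch omits the thesis's actual affinity construction (color, texture, and deep semantic features) and the post-processing of eigenvectors into opacity layers:

```python
import numpy as np

def soft_segments_from_affinity(W, k=2):
    """Spectral soft grouping: the k eigenvectors of the symmetric
    normalized Laplacian L = I - D^{-1/2} W D^{-1/2} with the smallest
    eigenvalues vary smoothly within groups and switch sign across them."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    return vecs[:, :k]
```

On a graph of two tightly connected clusters joined by one weak edge, the second ("Fiedler") eigenvector takes opposite signs on the two clusters, which is the raw material a soft-segmentation pipeline refines into layers.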
  • Item
    Model-based human performance capture in outdoor scenes
    (2019-05-21) Robertini, Nadia
    Technologies for motion and performance capture of real actors have enabled the creation of realistic-looking virtual humans through detail and deformation transfer, at the cost of extensive manual work and sophisticated in-studio marker-based systems. This thesis pushes the boundaries of performance capture by proposing automatic algorithms for robust 3D skeleton and detailed surface tracking in less constrained multi-view outdoor scenarios. Contributions include new multi-layered human body representations designed for effective model-based, time-consistent reconstruction from a set of vision cameras in complex dynamic environments with varying illumination. We design dense surface refinement approaches to enable smooth, silhouette-free model-to-image alignment, as well as coarse-to-fine tracking techniques to enable joint estimation of skeleton motion and fine-scale surface deformations in complicated scenarios. High-quality results attained on challenging application scenarios confirm the contributions and show great potential for the automatic creation of personalized 3D virtual humans.
  • Item
    Advances on computational imaging, material appearance, and virtual reality
    (Universidad de Zaragoza, 2019-04-29) Serrano, Ana
    Visual computing is a recently coined term that embraces many subfields in computer science related to the acquisition, analysis, or synthesis of visual data through the use of computer resources. What brings all these fields together is that they are all related to the visual aspects of computing, and more importantly, that during the last years they have started to share similar goals and methods. This thesis presents contributions in three different areas within the field of visual computing: computational imaging, material appearance, and virtual reality. The first part of this thesis is devoted to computational imaging, and in particular to rich image and video acquisition. First, we deal with the capture of high dynamic range images in a single shot, where we propose a novel reconstruction algorithm based on sparse coding and reconstruction to recover the full range of luminances of the scene being captured from a single coded low dynamic range image. Second, we focus on the temporal domain, where we propose to capture high speed videos via a novel reconstruction algorithm, again based on sparse coding, that allows recovering high speed video sequences from a single photograph with encoded temporal information. The second part attempts to address the long-standing problem of visual perception and editing of real world materials. We propose an intuitive, perceptually based editing space for captured data. We derive a set of meaningful attributes for describing appearance, and we build a control space based on these attributes by means of a large scale user study. Finally, we propose a series of applications for this space. One of these applications to which we devote particular attention is gamut mapping. The range of appearances displayable on a particular display or printer is called the gamut. 
Given a desired appearance that may lie outside of this gamut, gamut mapping consists of making it displayable without excessively distorting the final perceived appearance. For this task, we use our previously derived perceptually-based space to introduce visual perception into the mapping process, helping to minimize the perceived visual distortions that may arise. The third part is devoted to virtual reality. We first focus on the study of human gaze behavior in static omnistereo panoramas. We collect gaze samples, provide an analysis of this data, and then propose a series of applications that make use of the derived insights. Next, we investigate more intricate behaviors in dynamic environments in a cinematographic context. We gather gaze data from viewers watching virtual reality videos containing different edits with varying parameters, and provide the first systematic analysis of viewers' behavior and the perception of continuity in virtual reality video. Finally, we propose a novel method for adding parallax for 360° video visualization in virtual reality headsets.
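The sparse-coding reconstructions in the first part rest on recovering a sparse coefficient vector from a coded measurement. As a hedged illustration of the general principle (not the thesis's reconstruction algorithm), orthogonal matching pursuit greedily selects dictionary atoms and re-fits the coefficients at each step:

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal matching pursuit: greedily pick up to k columns of the
    dictionary D that best explain y, refitting all selected coefficients
    by least squares after each selection. Returns a sparse code x."""
    residual, support, coef = y.astype(float), [], np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))   # most correlated atom
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x
```

In the single-shot HDR and high-speed settings, `y` would be the coded low-dynamic-range or temporally coded photograph and `D` a learned dictionary, with the recovered sparse code yielding the full-range image or the video frames.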
  • Item
    Practical Measurement-based Modeling and Rendering of Surface Diffraction
    (2019) Toisoul, Antoine
    Computer graphics has evolved at a very fast pace over the last forty years. Most of the research in rendering has been focused on recreating visual effects that can be explained with geometric optics, such as reflections from diffuse and specular surfaces, refraction and volumetric scattering. Although geometric optics covers a wide range of effects related to light transport, some very impressive and colourful effects can only be explained and rendered with wave optics. This is the case for the diffraction of light. Diffraction is a very common effect that causes dispersion of light, i.e., the decomposition of white light into colourful patterns on a surface. It is caused by interference between light waves when the geometry of a surface reaches a size below the coherence length of white light (around 65 micrometers). The most famous example of a diffractive surface is probably a Compact Disc, on which the bits of information are stored along tracks that are small enough to diffract light. In this thesis, we present novel approaches to generate photorealistic computer renderings of diffraction of light from measurements of real-world surfaces. We present four practical measurement setups that employ commonly found hardware to acquire reflectance properties of both spatially-homogeneous diffractive surfaces and spatially-varying printed holographic surfaces. We also describe how such measurements can be employed in conjunction with a physically-based rendering model of diffraction to avoid Fourier optics simulations and therefore reduce the computational expense of diffraction rendering. Finally, we present techniques to render diffraction effects under arbitrary illumination at real-time framerates, which is computationally very expensive with conventional techniques.
These contributions constitute the first demonstration of realistic renderings of complex diffraction patterns observed in manufactured materials using practical measurement techniques at the interface of photography and optics. The algorithms presented in this thesis can be implemented in real-time applications such as video games and virtual reality experiences.
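The Compact Disc example can be made concrete with the textbook grating equation, which governs where each diffracted colour exits the surface. This is standard optics background, not the thesis's measurement-based reflectance model:

```python
import numpy as np

def diffraction_orders(pitch_nm, wavelength_nm, incidence_deg=0.0):
    """Grating equation sin(theta_m) = sin(theta_i) + m * lambda / d.
    Returns {order m: exit angle in degrees} for every propagating order."""
    si = np.sin(np.radians(incidence_deg))
    orders = {}
    for m in range(-5, 6):
        s = si + m * wavelength_nm / pitch_nm
        if abs(s) <= 1.0:                 # evanescent orders do not propagate
            orders[m] = np.degrees(np.arcsin(s))
    return orders
```

For a CD (track pitch roughly 1600 nm) under green light at 532 nm and normal incidence, orders m = -3 … 3 propagate, with the first order leaving at about 19°; since the exit angle depends on wavelength, white light fans out into the familiar rainbow.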
  • Item
    Efficient Light-Transport Simulation Using Machine Learning
    (ETH Zürich, 2019) Müller, Thomas
    The goal of this dissertation is the efficient synthesis of photorealistic images on a computer. Currently, by far the most popular approach for photorealistic image synthesis is path tracing, a Monte Carlo simulation of the integral equations that describe light transport. We investigate several data-driven approaches for improving the convergence of path tracing, leveraging increasingly sophisticated machine-learning models. Our first approach focuses on the specific setting of "multiple scattering in translucent materials", whereas the following approaches operate in the more general "path-guiding" framework. The appearance of bright translucent materials is dominated by light that scatters beneath the material surface hundreds to thousands of times. We sidestep an expensive, repeated simulation of such long light paths by precomputing the large-scale characteristics of material-internal light transport, which we use to accelerate rendering. Our method employs "white Monte Carlo", imported from biomedical optics, to precompute in a single step the exitant radiance on the surface of large spherical shells that can be filled with a wide variety of translucent materials. Constructing light paths by utilizing these shells is as efficient as popular diffusion-based approaches while introducing significantly less error. We combine this technique with prior work on rendering granular materials so that heterogeneous arrangements of polydisperse grains can be rendered efficiently. The computational cost of path construction is not the only factor in rendering efficiency. Equally important is the distribution of constructed paths, because it determines the stochastic error of the simulation. We present two path-guiding techniques that aim to improve this distribution by systematically guiding paths towards scene regions with large energy contribution.
To this end, we introduce a framework that learns a path-construction scheme online during rendering while optimally balancing the computational rendering and learning cost. In this framework, we use two novel path-generation models: a performance-optimized spatio-directional tree ("SD-tree") and a neural-network-based generative model that utilizes normalizing flows. Our SD-tree is designed to learn the 5-D light field in a robust manner, making it suitable for production environments. Our neural networks, on the other hand, are able to learn the full 7-D integrand of the rendering equation, leading to higher-quality path guiding, albeit at increased computational cost. Our neural architecture generalizes beyond light-transport simulation and permits importance sampling of other high-dimensional integration problems.
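Why the distribution of constructed paths matters can be seen in a one-dimensional importance-sampling toy: the closer the sampling density matches the integrand, the lower the estimator's variance, with a perfect match giving zero variance. This is the generic principle path guiding exploits, not the SD-tree or normalizing-flow models themselves:

```python
import numpy as np

def mc_estimate(f, sample, pdf, n, rng):
    """Monte Carlo estimator of an integral: average f(x) / p(x) over
    n samples x drawn from the density p. Variance shrinks as p
    approaches the shape of f — the goal of path guiding."""
    x = sample(rng, n)
    return np.mean(f(x) / pdf(x))
```

For f(x) = 2x on [0, 1] (integral 1), uniform sampling gives a noisy estimate, while sampling from p(x) = 2x (inverse-CDF: x = sqrt(u)) makes f/p constant, so the estimator returns exactly 1 from any number of samples.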
  • Item
    Neighborhood Data Structures, Manifold Properties, and Processing of Point Set Surfaces
    (Refubium - Repositorum der Freien Universität Berlin, 2019-07-03) Skrodzki, Martin Dr.
    The PhD thesis, titled “Efficient Coordinates for Point Set Surfaces”, is concerned with point sets acquired by 3D acquisition techniques and their processing. The thesis covers and advances both theoretical and applied aspects of the field. The first topic concerns notions of neighborhood and corresponding data structures. Despite their advantages in storage space and ease of acquisition, the missing neighborhood relation is a significant downside of point set representations. The first part of the thesis establishes a novel neighborhood relation concept that relies on a bilateral weighting between Euclidean point distances and distances in the normal field. Experiments demonstrate its superiority over combinatorial or purely metrical neighborhoods in the presence of noise or outliers. Furthermore, the first part of the thesis is concerned with data structures for the fast computation of neighborhoods. It contains several novel theorems on the neighborhood grid data structure, which provides a fast and parallelizable means of computing approximate neighborhoods. The thesis further contributes to the literature on neighborhood data structures with an accessible discussion of the main theorem on k-d trees. The second part of the thesis deals with manifold structures for point set surfaces. When 3D-scanning real-world objects, typically only the surface is acquired for further processing in CAD or other applications. As the surface of the underlying real-world geometry has the structure of a manifold, it can be expected that this structure is reflected by any point set acquired from the geometry. The thesis establishes a scheme to interpret point sets in terms of their implicitly underlying manifolds. It describes a methodology based on the Moving Least Squares (MLS) framework to obtain both local coordinate charts and chart transition maps from a raw point cloud.
Furthermore, it translates the approach of Variational Shape Approximation (VSA) to point sets. The most significant contribution here is the characterization of failure cases of the previous VSA approach and the presentation of a VSA algorithm that terminates for all inputs. Third and finally, algorithms have to work efficiently and robustly on the input point set. While meshed geometries provide an intuitive and natural weighting by the areas of the faces, point sets can at most work with distances between the points. This introduces a new level of difficulty to be overcome by any point set processing algorithm. The third chapter of the thesis introduces a novel weighting scheme to counteract non-uniformity in point sets. Additionally, it draws on the MLS framework to build a feature detection algorithm with mathematical guarantees. A third application area is the denoising of point sets. In this field, the thesis presents an iterative denoising scheme to efficiently handle both isotropic and anisotropic cases. In summary, the thesis contributes to all aspects of the point set processing pipeline. Both theoretical and practical advances are made. All contributions are evaluated in practice in the context of several application areas.
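The bilateral neighborhood idea from the first part can be sketched as follows (a minimal illustration; the Gaussian form and the bandwidths `sigma_d`, `sigma_n` are my assumptions, not the thesis' exact formulation): candidates that are spatially close but whose normals disagree, e.g. across a sharp edge, receive a small weight.

```python
import math

def bilateral_weight(p, n_p, q, n_q, sigma_d=0.5, sigma_n=0.5):
    # Combine the Euclidean point distance with the distance in the
    # normal field (normals assumed unit length).  Points across a
    # sharp feature are penalized by the second factor.
    d_euclid = math.dist(p, q)
    d_normal = math.dist(n_p, n_q)
    return (math.exp(-d_euclid ** 2 / (2 * sigma_d ** 2)) *
            math.exp(-d_normal ** 2 / (2 * sigma_n ** 2)))

# Two candidates at equal Euclidean distance from p:
p,  n_p  = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
q1, n_q1 = (0.3, 0.0, 0.0), (0.0, 0.0, 1.0)   # same smooth patch
q2, n_q2 = (0.0, 0.3, 0.0), (1.0, 0.0, 0.0)   # across a sharp edge

w_same = bilateral_weight(p, n_p, q1, n_q1)
w_edge = bilateral_weight(p, n_p, q2, n_q2)
# w_same > w_edge: the edge-crossing candidate is down-weighted.
```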
  • Item
    Locally Solving Linear Systems for Geometry Processing
    (2019) Herholz, Philipp
    Geometry processing algorithms commonly need to solve linear systems involving discrete Laplacians. In many cases this constitutes a central building block of the algorithm and dominates runtime. Usually, highly optimized libraries are employed to solve these systems; however, they are built to solve very general linear systems. I argue that it is possible to create more efficient algorithms by exploiting domain knowledge and modifying the structure of these solvers accordingly. In this thesis I take a first step in that direction. The focus lies on Cholesky factorizations, which are commonly employed in the context of geometry processing. More specifically, I am interested in the solution of linear systems where variables are associated with vertices of a mesh. The central question is: given the Cholesky factorization of a linear system defined on the full mesh, how can we efficiently obtain solutions for a local set of vertices, possibly with new boundary conditions? I present methods to achieve this without computing the value at all vertices or refactoring the system from scratch. Developing these algorithms requires a detailed understanding of sparse Cholesky factorizations and modifications of their implementation. The methods are analyzed and validated in concrete applications. Ideally, this thesis will stimulate research in geometry processing and related fields to jointly develop algorithms and numerical methods rather than treating them as distinct blocks.
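The factor-once, solve-many idea underlying the central question can be sketched on a dense toy system (an illustration only: the thesis' actual contribution, extracting local solutions with new boundary conditions from a sparse factorization, is not reproduced here):

```python
import numpy as np

# Toy positive-definite Laplacian-like system (a dense stand-in; the
# thesis works with sparse Cholesky factors of mesh Laplacians).
A = np.array([
    [ 2.0, -1.0,  0.0,  0.0],
    [-1.0,  3.0, -1.0,  0.0],
    [ 0.0, -1.0,  3.0, -1.0],
    [ 0.0,  0.0, -1.0,  2.0],
])
C = np.linalg.cholesky(A)  # factor once: A = C @ C.T

def solve_with_factor(C, b):
    # Reuse the precomputed factor: forward then back substitution.
    y = np.linalg.solve(C, b)
    return np.linalg.solve(C.T, y)

# Several right-hand sides, no refactorization.
x1 = solve_with_factor(C, np.array([1.0, 0.0, 0.0, 0.0]))
x2 = solve_with_factor(C, np.array([0.0, 0.0, 0.0, 1.0]))
```

The expensive step is the factorization; each subsequent solve is two cheap triangular solves, which is the cost model the thesis pushes further by restricting solves to local vertex sets.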
  • Item
    Viewpoint-Free Photography for Virtual Reality
    (University College London, 2019-07-28) Hedman, Peter
    Viewpoint-free photography, i.e., interactively controlling the viewpoint of a photograph after capture, is a standing challenge. In this thesis, we investigate algorithms to enable viewpoint-free photography for virtual reality (VR) from casual capture, i.e., from footage easily captured with consumer cameras. We build on an extensive body of work in image-based rendering (IBR). Given images of an object or scene, IBR methods aim to predict the appearance of an image taken from a novel perspective. Most IBR methods focus on full or near-interpolation, where the output viewpoints either lie directly between captured images, or nearby. These methods are not suitable for VR, where the user has significant range of motion and can look in all directions. Thus, it is essential to create viewpoint-free photos with a wide field-of-view and sufficient positional freedom to cover the range of motion a user might experience in VR. We focus on two VR experiences: 1) Seated VR experiences, where the user can lean in different directions. This simplifies the problem, as the scene is only observed from a small range of viewpoints. Thus, we focus on easy capture, showing how to turn panorama-style capture into 3D photos, a simple representation for viewpoint-free photos, and also how to speed up processing so users can see the final result on-site. 2) Room-scale VR experiences, where the user can explore vastly different perspectives. This is challenging: More input footage is needed, maintaining real-time display rates becomes difficult, view-dependent appearance and object backsides need to be modelled, all while preventing noticeable mistakes. We address these challenges by: (1) creating refined geometry for each input photograph, (2) using a fast tiled rendering algorithm to achieve real-time display rates, and (3) using a convolutional neural network to hide visual mistakes during compositing. 
Overall, we provide evidence that viewpoint-free photography is feasible from casual capture. We thoroughly compare with the state-of-the-art, showing that our methods achieve both a numerical improvement and a clear increase in visual quality for both seated and room-scale VR experiences.
  • Item
    3D scene analysis through non-visual cues
    (University College London, 2019-10-06) Monszpart, Aron
    The wide applicability of scene analysis from as few viewpoints as possible attracts the attention of many scientific fields, ranging from augmented reality to autonomous driving and robotics. When approaching 3D problems in the wild, one has to admit that the problems are particularly challenging, as a monocular setup is severely under-constrained. One has to design algorithmic solutions that resourcefully take advantage of abundant prior knowledge, much like the way humans reason. I propose the utilization of non-visual cues to interpret visual data. I investigate how making non-restrictive assumptions about the scene, such as “obeys Newtonian physics” or “is made by or for humans”, greatly improves the quality of information retrievable from the same type of data. I successfully reason about the hidden constraints that shaped the acquired scene to come up with abstractions that represent likely estimates of the unobservable or difficult-to-acquire parts of scenes. I hypothesize that jointly reasoning about these hidden processes and the observed scene allows for more accurate inference and lays the way for prediction through understanding. Applications of the retrieved information range from image and video editing (e.g., visual effects) through robotic navigation to assisted living.
  • Item
    Generating High-quality 3D Assets from Easy-to-access 2D content
    (University College London, 2019-06-28) Wang, Yangtuanfeng
    In the context of content creation, there is an increasing demand for high-quality digital models including object shape, texture, environment illumination, physical properties, etc. As design and preview presentations become exclusively digital, the need for high-quality 3D assets has grown sharply. The demand, however, is challenging to meet, as the process of creating such digital 3D assets remains mostly manual: heavy post-processing is still needed to clean up captures from commercial 3D capturing devices, or models have to be created manually from scratch. On the other hand, low-quality 3D data is much easier to obtain, e.g., modeled by hand, captured with a low-end device, or generated using a virtual simulator. In this thesis, we develop algorithms that consume such low-quality 3D data and 2D cues to automatically create enriched 3D content of higher quality. Specifically, with the help of low-quality underlying 3D geometry, we explore (i) how to recover 3D shape from 2D images while factorizing camera motion and object motion in a dynamic scene; (ii) how to transfer texture and illumination from a captured 2D image to 3D shapes of the same category; (iii) how to decompose a 360° environment map and BRDF materials from photos and reduce ambiguity by joint observation; and (iv) how to model 3D garment shape and its physical properties from a 2D sketch or image.
  • Item
    GPU Data Structures and Code Generation for Modeling, Simulation, and Visualization
    (2019-12-16) Mueller-Roemer, Johannes Sebastian
    Virtual prototyping, the iterative process of using computer-aided (CAx) modeling, simulation, and visualization tools to optimize prototypes and products before manufacturing the first physical artifact, plays an increasingly important role in the modern product development process. Especially due to the availability of affordable additive manufacturing (AM) methods (3D printing), it is becoming increasingly possible to manufacture customized products or even for customers to print items for themselves. In such cases, the first physical prototype is frequently the final product. In this dissertation, methods to efficiently parallelize modeling, simulation, and visualization operations are examined with the goal of reducing iteration times in the virtual prototyping cycle, while simultaneously improving the availability of the necessary CAx tools. The presented methods focus on parallelization on programmable graphics processing units (GPUs). Modern GPUs are fully programmable massively parallel manycore processors that are characterized by their high energy efficiency and good price-performance ratio. Additionally, GPUs are already present in many workstations and home computers due to their use in computer-aided design (CAD) and computer games. However, specialized algorithms and data structures are required to make efficient use of the processing power of GPUs. Using the novel GPU-optimized data structures and algorithms as well as the new applications of compiler technology introduced in this dissertation, speedups between approximately one (10×) and more than two orders of magnitude (> 100×) are achieved compared to the state of the art in the three core areas of virtual prototyping. Additionally, memory use and required bandwidths are reduced by up to nearly 86%. As a result, not only can computations on existing models be executed more efficiently, but larger models can be created and processed as well. In the area of modeling, efficient discrete mesh processing algorithms are examined with a focus on volumetric meshes. In the field of simulation, the assembly of the large sparse system matrices resulting from the finite element method (FEM) and the simulation of fluid dynamics are accelerated. As sparse matrices form the foundation of the presented approaches to mesh processing and simulation, GPU-optimized sparse matrix data structures, as well as hardware- and domain-specific automatic tuning of these data structures, are developed and examined. In the area of visualization, latencies in remote visualization of cloud-based simulations are reduced by using an optimizing query compiler. By using hybrid visualization, various user interactions can be performed without network round-trip latencies.
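As an indication of the kind of sparse matrix data structure such work builds on, here is a minimal CPU sketch of the compressed sparse row (CSR) format (the dissertation's GPU-optimized layouts and automatic tuning are far more involved; this sketch only shows why rows can be processed independently, which is what makes the format amenable to parallelization):

```python
import numpy as np

def csr_spmv(values, col_idx, row_ptr, x):
    # y = A @ x for a matrix stored in CSR form: values/col_idx hold
    # the nonzeros row by row, and row_ptr[i]:row_ptr[i+1] delimits
    # row i.  Each row's dot product is independent of the others.
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# CSR encoding of [[4, 0, 1], [0, 3, 0], [2, 0, 5]]:
values  = np.array([4.0, 1.0, 3.0, 2.0, 5.0])
col_idx = np.array([0, 2, 1, 0, 2])
row_ptr = np.array([0, 2, 3, 5])

y = csr_spmv(values, col_idx, row_ptr, np.array([1.0, 1.0, 1.0]))
# y == [5.0, 3.0, 7.0]
```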
  • Item
    Quad Meshes as Optimized Architectural Freeform Structures
    (2019-10) Pellis, Davide
    This thesis tackles the design of freeform, surface-like, load-bearing structures realized with cladding panels and supported by a framework substructure, often called gridshells. The actual fabrication of freeform gridshells is a challenging task and easily leads to unsustainable costs. A well-known strategy for realizing a gridshell is to use as layout a so-called principal mesh. This is a quadrilateral mesh whose edges follow the principal curvature directions of a continuous surface. In this way we achieve flat cladding panels and a substructure with simplified connections. This thesis shows that quadrilateral meshes, besides allowing manufacturing simplification, are also optimal solutions both for static performance and for smooth visual appearance. In particular, we show that the best static performance is achieved for quad meshes discretizing membranes along principal stress lines, and we obtain an absolute minimum on membranes where the integral of absolute principal stresses is minimal. We also show that the best smooth visual appearance is achieved for principal meshes; the absolute minimum is now reached for principal meshes discretizing surfaces where the integral of absolute principal curvatures is minimal. Therefore, from membranes where stress and curvature directions are aligned, and where the total absolute stress is minimal, we can extract principal meshes with the best static performance and optimal visual appearance. We then present computational tools for the design of such highly efficient gridshells.
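In symbols (the notation is mine, not necessarily the thesis'), with principal curvatures $\kappa_1, \kappa_2$, principal stresses $\sigma_1, \sigma_2$, and area element $\mathrm{d}A$ over the surface $S$, the two minimized integrals read:

```latex
% Total absolute curvature (smooth visual appearance) and
% total absolute stress (static performance):
E_{\mathrm{curv}} = \int_{S} \big( |\kappa_1| + |\kappa_2| \big)\, \mathrm{d}A,
\qquad
E_{\mathrm{stress}} = \int_{S} \big( |\sigma_1| + |\sigma_2| \big)\, \mathrm{d}A
```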
  • Item
    Lightweight material acquisition using deep learning
    (2019-11) Deschaintre, Valentin
    Whether it is used for entertainment or industrial design, computer graphics is ever more present in our everyday life. Yet, reproducing a real scene's appearance in a virtual environment remains a challenging task, requiring long hours from trained artists. A good solution is the acquisition of geometries and materials directly from real-world examples, but this often comes at the cost of complex hardware and calibration processes. In this thesis, we focus on lightweight material appearance capture to simplify and accelerate the acquisition process and to solve industrial challenges such as result image resolution or calibration. Texture, highlights, and shading are some of the many visual cues that allow humans to perceive material appearance in pictures. Designing algorithms able to leverage these cues to recover spatially-varying bi-directional reflectance distribution functions (SVBRDFs) from a few images has challenged computer graphics researchers for decades. We explore the use of deep learning to tackle lightweight appearance capture and make sense of these visual cues. Once trained, our networks are capable of recovering per-pixel normals, diffuse albedo, specular albedo, and specular roughness from as little as one picture of a flat surface lit by the environment or a hand-held flash. We show how our method improves its prediction with the number of input pictures, reaching high-quality reconstructions with up to 10 images - a sweet spot between existing single-image and complex multi-image approaches - and allowing the capture of large-scale, HD materials. We achieve this goal by introducing several innovations in training data acquisition and network design, bringing clear improvement over the state of the art for lightweight material capture.
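To indicate how the four predicted per-pixel maps are consumed downstream, here is a toy shading sketch (the analytic model, the roughness-to-shininess mapping, and all names are simplifying assumptions made for illustration; the thesis relies on a physically based microfacet SVBRDF, not this Blinn-Phong style stand-in):

```python
import numpy as np

def shade_pixel(normal, diffuse, specular, roughness, light, view):
    # Evaluate a simplified Blinn-Phong style model at one pixel from
    # the four per-pixel quantities the networks predict.
    n = normal / np.linalg.norm(normal)
    l = light / np.linalg.norm(light)
    v = view / np.linalg.norm(view)
    h = (l + v) / np.linalg.norm(l + v)   # half vector
    n_dot_l = max(np.dot(n, l), 0.0)
    # Assumed mapping: rougher surfaces get a broader specular lobe.
    shininess = 2.0 / max(roughness ** 2, 1e-4)
    spec = specular * max(np.dot(n, h), 0.0) ** shininess
    return diffuse * n_dot_l + spec * n_dot_l

rgb = shade_pixel(
    normal=np.array([0.0, 0.0, 1.0]),
    diffuse=np.array([0.8, 0.2, 0.2]),
    specular=np.array([0.04, 0.04, 0.04]),
    roughness=0.3,
    light=np.array([0.0, 0.0, 1.0]),
    view=np.array([0.0, 0.0, 1.0]),
)
```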
  • Item
    Revealing the Invisible: On the Extraction of Latent Information from Generalized Image Data
    (Universitäts- und Landesbibliothek Bonn, 2020-01-08) Iseringhausen, Julian Dr.
    The desire to reveal the invisible in order to explain the world around us has been a source of impetus for technological and scientific progress throughout human history. Many of the phenomena that directly affect us cannot be sufficiently explained based on observations made with our primary senses alone. Often this is because their originating cause is either too small, too far away, or obstructed in other ways. In other words: it is invisible to us. Without careful observation and experimentation, our models of the world remain inaccurate, and research has to be conducted in order to improve our understanding of even the most basic effects. In this thesis, we present our solutions to three challenging problems in visual computing, where a surprising amount of information is hidden in generalized image data and cannot easily be extracted by human observation or existing methods. We extract this latent information using non-linear and discrete optimization methods based on physically motivated models and computer graphics methodology, such as ray tracing, real-time transient rendering, and image-based rendering.