2024
Computational models of visual attention and gaze behavior in virtual reality
Martin, Daniel
Photorealistic Simulation and Optimization of Lighting Conditions
Vitsas, Nick
Intrinsic approaches to learning and computing on curved surfaces
Wiersma, Ruben Timotheüs
Recent Submissions
Item Improving the efficiency of point cloud data management (TUprints, 2024-07) Bormann, Pascal
The collection of point cloud data has increased drastically in recent years, which poses challenges for the data management layer. Multi-billion point datasets are commonplace and users are getting accustomed to real-time data exploration in the Web. To make this possible, existing point cloud data management approaches rely on optimized data formats which are time- and resource-intensive to generate. This introduces long wait times before data can be used and frequent data duplication, since these optimized formats are often domain- or application-specific. As a result, data management is a challenging and expensive aspect when developing applications that use point cloud data. We observe that the interaction between applications and the point cloud data management layer can be modeled as a series of queries similar to those found in traditional databases. Based on this observation, we evaluate current point cloud data management using three query metrics: Responsiveness, throughput, and expressiveness. We contribute to the current state of the art by improving these metrics for both the handling of raw files without preprocessing, as well as indexed point clouds. In the domain of unindexed point cloud data, we introduce the concept of ad-hoc queries, which are queries executed ad-hoc on raw point cloud files. We demonstrate that ad-hoc queries can improve query responsiveness significantly as they do not require long wait times for indexing or database imports. Using columnar memory layouts, queries on datasets of up to a billion points can be answered in interactive or near-interactive time, with throughputs of more than one hundred million points per second on unindexed data. A demonstration of an adaptive indexing method shows that spending a few seconds per query on index creation can improve responsiveness by up to an order of magnitude.
Our experiments also confirm the importance of high-throughput systems when querying point cloud data, as the overhead of data transmission has a significant effect on the overall query performance. For situations where indexing is mandatory, we demonstrate improvements to the runtime performance of existing point cloud indexing tools. We developed a fast indexer based on task-parallel programming, using Morton indices to efficiently sort and distribute point batches onto worker threads. This system, called Schwarzwald, outperformed existing indexers by up to a factor of 9 when it was first published, and still has competitive performance to current out-of-core capable indexers. Additionally, we adapted our indexing algorithm for distributed processing in a cloud environment and demonstrate that its horizontal scalability allows it to outperform all existing indexers by up to a factor of 3. Lastly, we demonstrated point cloud indexing in real-time during Light Detection And Ranging (LiDAR) capturing, based on a similar task-based algorithm but optimized for progressive indexing. Our real-time indexer is able to keep up with current LiDAR sensors in a real-world test, with end-to-end latencies as low as 0.1 seconds. Together, our improvements significantly reduce wait times for working with point cloud data and increase the overall efficiency of the data access layer.
Item Visual Insights into Memory Behavior of GPU Ray Tracers (TUprints, 2024-07) von Buelow, Max
Ray tracing is a fundamental rendering technique that typically projects three-dimensional representations of a scene onto a two-dimensional display. This is achieved by perspectively sampling a set of rays into the scene and computing intersections against the relevant geometry. Secondary rays may be sent out from these intersection points, allowing for physically correct global illumination on the reverse photon direction.
Real-time rendering has historically used classical rasterization pipelines, which are straightforward to implement on hardware as they form a data-parallel problem projecting the whole scene into the coordinate system of the image. In contrast, task-parallel ray tracing suffers from incoherency between rays. However, recent advances in ray tracing have led to more efficient approaches, resulting in even more efficient embedded hardware implementations. While these approaches are already capable of rendering realistic images, further improvements in run-time performance can compensate for computational time to achieve higher framerates, display resolutions, ray-tracing recursion depths, or to reduce the energy footprint of ray-tracing data centers. A fundamental technique for improving ray-tracing performance is the use of bounding-volume hierarchies (BVH), which prevent rays from intersecting the entire scene, especially in occluded or distant regions. In addition to the structural efficiency of a BVH, the primary bottlenecks of GPU ray tracing are memory latency and work distribution. These factors mainly result in more coherent memory accesses, making caching more efficient. Creating programs with the goal of achieving higher caching rates typically requires increased programming efforts and a deep understanding of the hardware, as an additional abstraction layer is introduced, making the memory pipeline less transparent. General-purpose profilers aim to support the implementation process. However, they typically display caching rates based on kernel calls. This is because these values are measured using basic hardware counters that do not distinguish between the context of a memory access. In many cases, it would be useful to have a more detailed representation of memory-related profiling metrics, such as the number of recordings per memory allocation or projections into other domains, such as the framebuffer or the scene geometry.
This thesis presents a new method for simulating the GPU memory pipeline accurately. The method uses memory traces exported by dynamic binary instrumentation, which can be applied to any compiled GPU binaries, similar to standard profilers. The exported memory profiles can be used for performance visualization purposes in individual domains, as well as traditional memory profiling metrics that can be displayed in finer granularity than usual. A method for mapping memory metrics onto the original scene is included, allowing users to explore profiling results within the scene domain, making the profiling process more intuitive. In addition, this thesis presents a novel compressed ray-tracing implementation that optimizes its memory footprint by making assumptions about the topological properties of the scene to be rendered. The findings can be used to evaluate and optimize a wide range of ray tracing and ray marching applications in a user-friendly manner.
Item Computational models of visual attention and gaze behavior in virtual reality (2024-03-08) Martin, Daniel
Virtual reality (VR) is an emerging medium that has the potential to unlock unprecedented experiences. Since the late 1960s, this technology has advanced steadily, and can nowadays be a gateway to a completely different world. VR offers a degree of realism, immersion, and engagement never seen before, and lately we have witnessed how newer virtual content is being continuously created. However, to get the most out of this promising medium, there is still much to learn about people’s visual attention and gaze behavior in the virtual universe. Questions like “What attracts users’ attention?” or “How malleable is the human brain when in a virtual experience?” have no definite answer yet. We argue that it is important to build a principled understanding of viewing and attentional behavior in VR.
This thesis presents contributions in two key aspects: Understanding and modeling users’ gaze behavior, and leveraging imperceptible manipulations to improve the virtual experience. In the first part of this thesis we have focused on developing computational models of gaze behavior in virtual environments. First, and resorting to the well-known concept of saliency, we have devised models of user attention in 360° images and 360° videos that are able to predict which parts of a virtual scene are more likely to draw viewers’ attention. Then, we have designed another two computational models for spatio-temporal attention prediction, one of them able to simulate thousands of virtual observers per second by generating realistic sequences of gaze points in 360° images, and the other one predicting different, yet plausible sequences of fixations on traditional images. Additionally, we have explored how attention works in 3D meshes. All such models have allowed us to delve into the particularities of human gaze behavior under different environments. Besides that, we have aimed at achieving a deeper understanding of visual attention in multimodal environments. First, we have exhaustively reviewed a vast literature on the use of additional sensory modalities, like audio, haptics, or proprioception, in virtual reality (also known as multimodality), and its role and benefits in several disciplines. Then, we have gathered and analyzed the largest dataset of viewing behavior in ambisonic 360° videos to date, finding effects on different factors like type of content, or gender, among others. We have finally analyzed how viewing behavior varies depending on the performed tasks: We have delved into attention in the very specific case of driving scenarios, and we have also studied how significant effects in gaze behavior can be found when performing different tasks in immersive environments.
The second part of this thesis attempts to improve virtual experiences by means of imperceptible manipulations. We have first focused on lateral movement in VR, and have devised thresholds for the detection of such manipulations, which we then applied in three key problems in VR that have no definite solution yet, namely 6-DoF viewing of 3-DoF content, overcoming physical space constraints, and reducing motion sickness. On the other hand, we have explored the manipulation of the virtual scene, resorting to the phenomenon of change blindness, and have derived insights and guidelines on how to elicit or avoid such an effect, and how human brains’ limitations affect it.
Item Task-Aware 3D Geometric Synthesis (University of Toronto, 2024) Sellán, Silvia
This thesis is about the different ways in which three-dimensional shapes come into digital existence inside a computer. Specifically, it argues that this geometric synthesis process should be tuned to the specific end for which an object is modeled or captured, and proposes building algorithms specific to said end. The majority of this thesis is dedicated to how 3D shapes are designed, and introduces changes to this modeling process to incorporate manufacturing constraints (e.g., that an object can physically be built out of a specific material or with a specific machine), precomputed simulation data (e.g., an object’s response to an impact) or specific user inputs (e.g., 3D drawing in Virtual or Augmented Reality). Importantly, these changes include rethinking the ways in which geometry is commonly represented, instead introducing formats that benefit specific applications, as well as efficient algorithms for converting between them. By contrast, the latter part of this thesis concerns itself with the task of capturing real-world 3D surfaces, a process that necessarily involves reconstructing continuous mathematical objects from imperfect, noisy and occluded discrete information.
This thesis introduces a novel, stochastic lens from which to study this fundamentally underdetermined process, allowing for the introduction of task-specific priors as well as quantifying the uncertainty of common algorithmic predictions. This perspective is shown to provide critical insights in common 3D scanning paradigms. While geometric capture is the natural first step in which to introduce this statistical perspective, the thesis ends by enumerating other tasks further along the geometric processing pipeline that could benefit from it.
Item Photorealistic Simulation and Optimization of Lighting Conditions (2024-05) Vitsas, Nick
Lighting plays a very important role in our everyday life, affecting our safety, comfort, well-being and performance. Today, computational methods and tools can be applied to provide recommendations for improving light conditions and finding energy-efficient ways to exploit natural lighting. This thesis addresses the problem of computational optimization of light transport to improve lighting effectiveness, by improving on various aspects of the process, such as goal-driven parametric geometry configuration for building openings and interior design, efficient natural lighting sampling and interactive photorealistic simulation of light transport. Physically-based light transport is at the core of each task and we show how lighting evaluation has a broader application scope than image synthesis. In the domain of light-driven geometry optimization, the thesis makes two contributions, one concerning the opening design problem and one regarding the optimal arrangement of movable objects for interior design. Opening design comes at the early stages of architectural design and concerns decisions about the geometric characteristics of windows, skylights, hatches, etc. It greatly impacts the overall energy efficiency, thermal profile, air flow and appearance of a building, both internally and externally.
It also directly controls daylighting availability, which is very difficult to predict and assess without automatic tools. We developed a computational methodology and a system to automate the process of opening recommendations in a fully interactive virtual environment, fully supporting parametric constraints and illumination intentions. We optimize openings with respect to their shape, position, size and cardinality, based on Bayesian optimization to propose physically correct openings on the geometry of the building. For the light-driven interior design problem, we proposed and evaluated an automatic interior layout process to produce valid object arrangements guided by geometric and illumination constraints, optimizing for glare, correct illuminance levels and lighting uniformity. Geometric and lighting goals are combined into a cost function that allows for a hierarchical, stochastic exploration of the available space of valid configurations. Optimizing for the contribution of natural lighting is an integral part of any outdoor and indoor environment design process. Analytic formulas for clear skies are a computationally and memory efficient method to create physically accurate sky maps of clear sunny days. However, to simulate light transport, sky models must be efficiently sampled. This is typically done via standard importance sampling approaches for image-based lighting, which tend to be slow and wasteful for the predictable nature of the radiance distribution of analytic sky models. We propose and evaluate a method for fitting a truncated Gaussian mixture model on the radiance distribution of the sky map that is both compact and fast to evaluate. Light-driven geometry optimization requires both accurate and fast light transport evaluation, since a very large number of light-carrying paths needs to be evaluated at each new proposal state. 
Advances in graphics hardware have enabled interactive ray tracing, which relies on highly optimized data structures for the acceleration of ray queries. Bounding volume hierarchies based on axis-aligned bounding boxes have been the go-to data structure for fast ray-primitive intersections. Similar hierarchies of oriented bounding boxes (OBBs) provide much higher early hierarchy traversal termination rates; however, their construction requires complex algorithms for the extraction of tight-fitting OBBs. To further accelerate ray tracing for our tasks, we properly adapt a high quality OBB extraction algorithm from unordered point sets to operate directly on existing hierarchies, to effectively construct an OBB tree on the GPU. By combining our method with existing fast algorithms from the literature that construct hierarchies in real-time, we are able to produce OBB trees that are extremely fast to build and traverse on the GPU. Furthermore, to allow for accurate light transport evaluators accessible as industry-grade tools, we developed and presented WebRays, the first generic ray intersection framework for the Web that offers a programming interface similar to modern ray tracing pipelines for desktop platforms and allows the implementation of light-driven design tools accessible from any platform.
Item Intrinsic approaches to learning and computing on curved surfaces (2024-10-15) Wiersma, Ruben Timotheüs
This dissertation develops intrinsic approaches to learning and computing on curved surfaces. Specifically, we work on three tasks: analyzing 3D shapes using convolutional neural networks (CNNs), solving linear systems on curved surfaces, and recovering appearance properties from curved surfaces using multi-view capture. We argue that we can find more efficient and better performing algorithms for these tasks by using intrinsic geometry. Chapters two and three consider CNNs on curved surfaces.
We would like to find patterns with meaningful directional information, such as edges or corners. On images, it is straightforward to define a convolution operator that encodes directional information, as the pixel grid provides a global reference for directions. Such a global coordinate system is not available for curved surfaces. Chapter two presents Harmonic Surface Networks. We apply a 2D kernel to the surface by using local coordinate systems. These local coordinate systems could be rotated in any direction around the normal, which is a problem for consistent pattern recognition. We overcome this ambiguity by computing complex-valued, rotation-equivariant features and transporting these features between coordinate systems with parallel transport along shortest geodesics. Chapter three presents DeltaConv. DeltaConv is a convolution operator based on geometric operators from vector calculus, such as the Laplacian. A benefit of the Laplacian is that it is invariant to local coordinate systems. This solves the problem of a missing global coordinate system. However, the Laplacian operator is also isotropic. That means it cannot pick up on directional information. DeltaConv constructs anisotropic operators by splitting the Laplacian into gradient and divergence and applying a non-linearity in between. The resulting convolution operators are demonstrated on learning tasks for point clouds and achieve state-of-the-art results with a relatively simple architecture. Chapter four considers solving linear systems on curved surfaces. This is relevant for many applications in geometry processing: smoothing data, simulating or animating 3D shapes, or machine learning on surfaces. A common way to solve large systems on grid-based data is a multigrid method. Multigrid methods require a hierarchy of grids and the operators that map between the levels in the hierarchy. 
We show that these components can be defined for curved surfaces with irregularly spaced samples using a hierarchy of graph Voronoi diagrams. The resulting approach, Gravo Multigrid, achieves solving times comparable to the state-of-the-art, while taking an order of magnitude less time for pre-processing: from minutes to seconds for meshes with over a million vertices. Chapter five demonstrates the use of intrinsic geometry in the setting of appearance modeling, specifically capturing spatially-varying bidirectional reflectance distribution functions (SVBRDF). A low-cost setup to recover SVBRDFs is to capture photographs from multiple viewpoints. A challenge here is that some reflectance behavior only shows up under certain viewing positions and lighting conditions, which means that we might not be able to tell one material type from another. We frame this as a question of (un)certainty: how certain are we, based on the input data? We build on previous work that shows that the reflection function can be modeled as a convolution of the BRDF with the incoming light. We propose improvements to the convolution model and develop algorithms for uncertainty analysis fully contained in the frequency domain. The result is a fast and uncertainty-aware SVBRDF recovery on curved surfaces.