High-Performance Graphics 2015
Browsing High-Performance Graphics 2015 by Subject "GPU"
Item: An Adaptive Acceleration Structure for Screen-space Ray Tracing (ACM Siggraph, 2015)
Widmer, S.; Pajak, D.; Schulz, A.; Pulli, K.; Kautz, J.; Goesele, M.; Luebke, D. Editors: Petrik Clarberg and Elmar Eisemann
We propose an efficient acceleration structure for real-time screen-space ray tracing. The hybrid data structure represents the scene geometry by combining a bounding volume hierarchy with local planar approximations. This enables fast empty space skipping while tracing and yields exact intersection points for the planar approximation. In combination with an occlusion-aware ray traversal, our algorithm can quickly trace even multiple depth layers. Compared to prior work, our technique improves the accuracy of the results, is more general, and allows for advanced image transformations, as all pixels can cast rays in arbitrary directions. We demonstrate real-time performance for several applications, including depth-of-field rendering, stereo warping, and screen-space ray traced reflections.

Item: Bounding Volume Hierarchy Optimization through Agglomerative Treelet Restructuring (ACM Siggraph, 2015)
Domingues, Leonardo R.; Pedrini, Helio. Editors: Petrik Clarberg and Elmar Eisemann
In this paper, we present a new method for building high-quality bounding volume hierarchies (BVHs) on manycore systems. Our method is an extension of the current state of the art in GPU BVH construction, Treelet Restructuring Bounding Volume Hierarchy (TRBVH), and consists of optimizing an already existing tree by rearranging subsets of its nodes using a bottom-up agglomerative clustering approach. We implemented our solution for the NVIDIA Kepler architecture using CUDA and tested it on 16 distinct scenes, most of which are commonly used to evaluate the performance of acceleration structures. We show that our implementation is capable of producing trees whose quality is on par with the ones generated by TRBVH for those scenes, while being about 30% faster to do so.

Item: Grid-Free Out-Of-Core Voxelization to Sparse Voxel Octrees on GPU (ACM Siggraph, 2015)
Pätzold, Martin; Kolb, Andreas. Editors: Petrik Clarberg and Elmar Eisemann
In this paper, we present the first grid-free, out-of-core GPU voxelization method. Our method combines efficient parallel triangle voxelization on GPU with out-of-core technologies in order to allow the processing of scenes with large triangle counts at a high resolution. We directly generate the voxelized data in a sparse voxel octree (SVO) representation, without any intermediate grid structure ("grid-free"). We apply triangle preprocessing and avoid atomic operations, thus leading to an optimized, balanced GPU workload and efficient parallel triangle processing. Compared to existing out-of-core CPU approaches, we achieve proper handling of voxel attributes, i.e., all triangle attributes contributing to a voxel are accessible when calculating the voxel attribute. We test and compare our approach to state-of-the-art methods and demonstrate its viability in terms of speed, input triangle count, resolution and output quality.

Item: Reorder Buffer: An Energy-Efficient Multithreading Architecture for Hardware MIMD Ray Traversal (ACM Siggraph, 2015)
Lee, Won-Jong; Shin, Youngsam; Hwang, Seok Joong; Kang, Seok; Yoo, Jeong-Joon; Ryu, Soojung. Editors: Petrik Clarberg and Elmar Eisemann
In this paper, we present an energy- and area-efficient multithreading architecture for Multiple Instruction, Multiple Data (MIMD) ray tracing hardware targeted at low-power devices. Recent ray tracing hardware has predominantly adopted an MIMD approach for efficient parallel traversal of incoherent rays, and supports a multithreading scheme to hide latency and to resolve memory divergence. However, the conventional multithreading scheme has problems such as increased memory cost for thread storage and consumption of additional energy for bypassing threads to the pipeline. Consequently, we propose a new multithreading architecture called Reorder Buffer. Reorder Buffer solves these problems by dynamically reordering the rays in the input buffer according to the results of cache accesses. Unlike conventional schemes, Reorder Buffer is cost-effective and energy-efficient because it needs no additional thread memory and consumes no additional energy, as it makes use of existing resources. Simulation results show that our architecture is a potentially versatile solution for future ray tracing hardware in low-energy devices because it provides as much as 11.7% better cache utilization and is up to 4.7 times more energy-efficient than the conventional architecture.
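
The reordering idea in the Reorder Buffer abstract above (re-queue a ray in the input buffer when its cache access misses, instead of parking it in dedicated thread storage) can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the authors' hardware design: the function name `traverse_with_reordering`, the FIFO cache, the `cache_capacity` parameter, and the representation of a ray as a fixed list of node addresses are all hypothetical choices made for illustration.

```python
from collections import deque


def traverse_with_reordering(rays, cache_capacity=4):
    """Toy model (assumption, not the paper's architecture).

    rays: dict mapping a ray id to the list of node addresses it visits in order.
    Returns (completion_order, hits, misses).
    """
    # Hypothetical FIFO cache of node addresses; the oldest entry is evicted first.
    cache = deque(maxlen=cache_capacity)
    # Input buffer entries: (ray id, index of next node, fetch already issued?).
    buffer = deque((ray_id, 0, False) for ray_id, path in rays.items() if path)
    done, hits, misses = [], 0, 0

    while buffer:
        ray_id, step, fetched = buffer.popleft()
        path = rays[ray_id]
        address = path[step]
        if fetched or address in cache:
            # Data is available: either a cache hit or a returning miss.
            if not fetched:
                hits += 1
            step += 1
            if step == len(path):
                done.append(ray_id)
            else:
                # Keep this ray at the front so it continues while it hits.
                buffer.appendleft((ray_id, step, False))
        else:
            # Cache miss: issue the fetch and move the ray to the back of the
            # same input buffer, so other rays proceed in the meantime.
            misses += 1
            cache.append(address)
            buffer.append((ray_id, step, True))

    return done, hits, misses
```

In this model the only storage for a waiting ray is the input buffer itself, which is the property the abstract emphasizes; real hardware would also have to model fetch latency, pipeline stages, and energy, none of which are represented here.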