Denoising Monte Carlo Renderings: a Sub-Pixel Exploration with Deep Learning
Date
2024-05-29
Publisher
ETH Zurich
Abstract
Monte Carlo rendering techniques, exemplified by path tracing, faithfully capture the interaction between light and objects. They have therefore become the primary means by which visual effects and animation films render digital assets into frames. However, Monte Carlo rendering requires stochastic sampling within each pixel to estimate the pixel color, which converges slowly and forces a choice between high rendering cost and noisy images.
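For intuition, the following minimal Python sketch shows how a pixel color is estimated as the mean of stochastic radiance samples and why the residual noise shrinks only with the square root of the sample count; the radiance_fn callback is a hypothetical stand-in for a full path tracer, not part of the thesis.

import numpy as np

def estimate_pixel(radiance_fn, n_samples, rng):
    # radiance_fn(u, v) traces one path through the random sub-pixel
    # position (u, v) and returns an RGB radiance sample (hypothetical API).
    samples = np.array([radiance_fn(*rng.random(2)) for _ in range(n_samples)])
    estimate = samples.mean(axis=0)              # pixel color estimate
    noise = samples.var(axis=0) / n_samples      # residual variance, O(1/n_samples)
    return estimate, noise

Halving the noise level thus requires roughly four times as many samples, which is the cost that denoising aims to avoid.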
Fortunately, the similarity of neighboring noisy pixels can be exploited to produce much cleaner images. Such denoising techniques reduce the rendering budget significantly without sacrificing quality, and they are crucial for applying Monte Carlo rendering in production. One particular reason for the success of Monte Carlo denoisers, and their biggest difference from natural-image denoisers, is the flexibility to use sub-pixel information. That is, depending on the scene and the application scenario, the renderer can be instructed to output more data than just the noisy pixel color. This data can contain estimates of parts of the light transport or describe properties of the underlying scene. Such additional information can guide the denoiser to better preserve details, remove noise, or serve downstream workflows.
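As a simplified illustration of how auxiliary buffers can guide a denoiser (a classical cross-bilateral filter, not the learned denoisers developed in this thesis), the sketch below weights each neighboring pixel by spatial distance and by similarity in assumed albedo and normal feature buffers, so that edges present in the features are preserved while noise in flat regions is averaged away.

import numpy as np

def cross_bilateral_denoise(color, albedo, normal, radius=3,
                            sigma_s=2.0, sigma_a=0.1, sigma_n=0.1):
    # color, albedo, normal: (H, W, 3) arrays; albedo and normal act as
    # noise-free guides that define which neighbors are "similar".
    h, w, _ = color.shape
    out = np.zeros_like(color)
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            w_s = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
            w_a = np.exp(-np.sum((albedo[y0:y1, x0:x1] - albedo[y, x]) ** 2, -1)
                         / (2 * sigma_a ** 2))
            w_n = np.exp(-np.sum((normal[y0:y1, x0:x1] - normal[y, x]) ** 2, -1)
                         / (2 * sigma_n ** 2))
            wgt = (w_s * w_a * w_n)[..., None]
            out[y, x] = (wgt * color[y0:y1, x0:x1]).sum((0, 1)) / wgt.sum()
    return out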
In this thesis, we design denoising algorithms for Monte Carlo renderings by applying deep-learning techniques in the sub-pixel domain, along three axes: light-transport decomposition, auxiliary feature buffers, and intra-pixel depth separation. Our work mainly targets high-quality offline rendering, and we validate the effectiveness of the methods on both academic and production-quality datasets. First, inspired by user-defined decompositions such as diffuse–specular, we propose to prepend a learned decomposition module to the denoiser; the learned decomposition typically produces images that are easier to denoise. Results show that this architecture outperforms an end-to-end denoiser with a similar number of trainable parameters, achieving a significant reduction in rendering cost at equal quality. Second, the power of auxiliary feature buffers for denoising prompts us to explore appropriate feature sets for denoising volumetric effects. Our training–selection–retraining workflow sifts useful features from a large pool of candidates at relatively low cost, and the resulting feature sets improve denoising quality for denoisers with different architectures on a variety of volumetric effects. Finally, depth separation within each pixel underlies the deep-Z format, which is useful for compositing but has lacked an effective denoiser that preserves its depth structure. We propose a neural denoiser for deep-Z images based on 3-D convolutional neural networks, which effectively removes noise at different depth levels and greatly reduces the rendering cost of deep-compositing workflows.
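The following PyTorch-style sketch illustrates the decomposition-then-denoise idea of the first contribution at a schematic level only; the layer sizes, the number of components k, and the shared per-component denoiser are illustrative assumptions, not the architecture used in the thesis.

import torch
import torch.nn as nn

class LearnedDecomposition(nn.Module):
    # Splits a noisy radiance image into k component images that sum back
    # to the input, so each component can be denoised separately.
    def __init__(self, in_ch=3, k=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, k, 3, padding=1),
        )

    def forward(self, radiance):
        weights = torch.softmax(self.net(radiance), dim=1)   # (B, k, H, W)
        # Each component is the radiance modulated by one soft mask;
        # because the masks sum to one, summing the components
        # reconstructs the input exactly.
        return weights.unsqueeze(2) * radiance.unsqueeze(1)  # (B, k, 3, H, W)

class DecomposeThenDenoise(nn.Module):
    def __init__(self, denoiser, k=4):
        super().__init__()
        self.decomp = LearnedDecomposition(k=k)
        self.denoiser = denoiser  # any image-to-image network, shared across components

    def forward(self, radiance):
        comps = self.decomp(radiance)                         # (B, k, 3, H, W)
        b, k, c, h, w = comps.shape
        denoised = self.denoiser(comps.reshape(b * k, c, h, w))
        return denoised.reshape(b, k, c, h, w).sum(dim=1)     # recompose the clean image

Both modules are trained jointly against a reference image, so the decomposition is free to discover whatever split makes the downstream denoising easiest.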
The main contributions of the thesis are as follows. First, we propose novel deep-learning denoising techniques that operate in the sub-pixel domain or exploit sub-pixel information, and we experimentally show that they advance the state of the art in denoising Monte Carlo renderings. Second, we demonstrate the benefit of specialized sub-pixel information for specific types of rendering, such as volumetric effects. Finally, we show that 2-D deep-learning denoising techniques can be generalized to deep-Z images while preserving the sub-pixel depth structure.