Local Attention Guided Joint Depth Upsampling

Date
2022
Publisher
The Eurographics Association
Abstract
Image super-resolution is a classical computer vision problem. One branch of it is guided depth super-resolution, where the goal is to accurately upsample a given low-resolution depth map with the help of features aggregated from the high-resolution color image of the same scene. Recently, transformers have improved performance on general image processing tasks, an improvement credited to self-attention. Unlike previous methods for guided joint depth upsampling, which rely mostly on CNNs, we compute self-attention efficiently using local image attention, which avoids the quadratic growth typically found in self-attention layers. Our work combines CNNs and transformers to analyze the two input modalities and employs a cross-modal fusion network to predict both a weighted per-pixel filter kernel and a residual for the depth estimation. To further enhance the final output, we integrate a differentiable, trainable deep guided filtering network that provides an additional depth prior. An ablation study and empirical trials demonstrate the importance of each proposed module. Our method achieves performance comparable to, and in some cases matching, the state of the art on the guided depth upsampling task.
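To illustrate why restricting attention to a local window avoids quadratic growth, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names `local_self_attention` and `attention_pairs`, the `radius` parameter, and the use of raw features as queries, keys, and values (no learned projections) are all assumptions made for illustration.

```python
import math


def local_self_attention(feat, radius=1):
    """Self-attention restricted to a (2*radius+1)^2 window around each pixel.

    feat: H x W grid of C-dimensional feature vectors (nested lists).
    Assumption: queries, keys and values are the raw features themselves;
    a real model would apply learned projections first.
    """
    h, w = len(feat), len(feat[0])
    c = len(feat[0][0])
    scale = 1.0 / math.sqrt(c)
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            q = feat[y][x]
            # Gather neighbours inside the window, clipped at the image border.
            neigh = [feat[ny][nx]
                     for ny in range(max(0, y - radius), min(h, y + radius + 1))
                     for nx in range(max(0, x - radius), min(w, x + radius + 1))]
            # Scaled dot-product scores, then a numerically stable softmax.
            scores = [scale * sum(a * b for a, b in zip(q, k)) for k in neigh]
            peak = max(scores)
            weights = [math.exp(s - peak) for s in scores]
            total = sum(weights)
            # Output is the softmax-weighted average of the neighbour values.
            out[y][x] = [sum(wt * v[i] for wt, v in zip(weights, neigh)) / total
                         for i in range(c)]
    return out


def attention_pairs(h, w, radius=None):
    """Upper bound on query-key pairs: global if radius is None, else local."""
    n = h * w
    if radius is None:
        return n * n                      # every pixel attends to every pixel
    return n * (2 * radius + 1) ** 2      # every pixel attends to one window
```

For a 64 x 64 feature map, `attention_pairs(64, 64)` counts 4096^2 query-key pairs for global attention, while `attention_pairs(64, 64, radius=3)` counts at most 4096 * 49, so the cost grows linearly with the number of pixels for a fixed window size.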
Description

CCS Concepts: Computing methodologies --> Computer vision; Image representations; Reconstruction

        
@inproceedings{10.2312:vmv.20221197,
  booktitle = {Vision, Modeling, and Visualization},
  editor = {Bender, Jan and Botsch, Mario and Keim, Daniel A.},
  title = {{Local Attention Guided Joint Depth Upsampling}},
  author = {Mallick, Arijit and Engelhardt, Andreas and Braun, Raphael and Lensch, Hendrik P. A.},
  year = {2022},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-189-2},
  DOI = {10.2312/vmv.20221197}
}