43-Issue 7

Permanent URI for this collection

https://diglib.eg.org/handle/10.2312/3607047

Adversarial Unsupervised Domain Adaptation for 3D Semantic Segmentation with 2D Image Fusion of Dense Depth
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Zhang, Xindan; Li, Ying; Sheng, Huankun; Zhang, Xinnian; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Unsupervised domain adaptation (UDA) is increasingly used for 3D point cloud semantic segmentation tasks due to its ability to address the issue of missing labels for new domains. However, most existing UDA methods focus only on uni-modal data and are rarely applied to multi-modal data. Therefore, we propose a cross-modal UDA method for multi-modal datasets that contain 3D point clouds and 2D images for 3D semantic segmentation. Specifically, we first propose a Dual discriminator-based Domain Adaptation (Dd-bDA) module to improve adaptation across domains. Second, given that the robustness of depth information to domain shifts can provide more details for semantic segmentation, we further employ a Dense depth Feature Fusion (DdFF) module to extract image features with rich depth cues. We evaluate our model in four unsupervised domain adaptation scenarios, i.e., dataset-to-dataset (A2D2→SemanticKITTI), day-to-night, country-to-country (USA→Singapore), and synthetic-to-real (VirtualKITTI→SemanticKITTI). In all settings, our method achieves significant improvements and surpasses state-of-the-art models.
FSH3D: 3D Representation via Fibonacci Spherical Harmonics
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Li, Zikuan; Huang, Anyi; Jia, Wenru; Wu, Qiaoyun; Wei, Mingqiang; Wang, Jun; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Spherical harmonics are a favorable technique for 3D representation, employing a frequency-based approach through the spherical harmonic transform (SHT). Typically, SHT is performed using equiangular sampling grids. However, these grids are non-uniform on spherical surfaces and exhibit local anisotropy, a common limitation in existing spherical harmonic decomposition methods. This paper proposes a 3D representation method using Fibonacci Spherical Harmonics (FSH3D). We introduce a spherical Fibonacci grid (SFG), which is more uniform than equiangular grids for SHT in the frequency domain. Our method employs analytical weights for SHT on SFG, effectively assigning sampling errors to spherical harmonic degrees higher than the recovered band-limited function. This provides a novel solution for spherical harmonic transformation on non-equiangular grids. The key advantages of our FSH3D method include: 1) With the same number of sampling points, SFG captures more features without bias compared to equiangular grids; 2) The root mean square error of 32-degree spherical harmonic coefficients is reduced by approximately 34.6% for SFG compared to equiangular grids; and 3) FSH3D offers more stable frequency domain representations, especially under rotational transformations. Its application in 3D shape reconstruction and 3D shape classification results in more accurate and robust representations. Our code is publicly available at https://github.com/Miraclelzk/Fibonacci-Spherical-Harmonics.
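For readers unfamiliar with spherical Fibonacci grids, the following is a minimal sketch of one common golden-angle construction. The abstract does not state which grid formula or weights FSH3D uses, so the function name `fibonacci_sphere` and the exact offsets are assumptions for illustration and are not taken from the authors' code.

```python
import math

def fibonacci_sphere(n):
    """Return n points on the unit sphere using a golden-angle spiral.

    Illustrative construction only; the paper's own grid may differ.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # Place z at the midpoint of n equal-height bands, so each point
        # represents roughly the same surface area.
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        # Advance the azimuth by the golden angle at every step.
        phi = i * golden_angle
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points
```

Unlike an equiangular latitude-longitude grid, this layout does not crowd samples near the poles, which is the uniformity property the abstract refers to.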
GSEditPro: 3D Gaussian Splatting Editing with Attention-based Progressive Localization
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Sun, Yanhao; Tian, Runze; Han, Xiao; Liu, Xinyao; Zhang, Yan; Xu, Kai; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
With the emergence of large-scale Text-to-Image (T2I) models and implicit 3D representations like Neural Radiance Fields (NeRF), many text-driven generative editing methods based on NeRF have appeared. However, the implicit encoding of geometric and textural information poses challenges in accurately locating and controlling objects during editing. Recently, significant advancements have been made in editing methods for 3D Gaussian Splatting, a real-time rendering technology that relies on explicit representation. However, these methods still suffer from issues including inaccurate localization and limited control over editing. To tackle these challenges, we propose GSEditPro, a novel 3D scene editing framework which allows users to perform various creative and precise edits using text prompts only. Leveraging the explicit nature of the 3D Gaussian distribution, we introduce an attention-based progressive localization module to add semantic labels to each Gaussian during rendering. This enables precise localization of editing areas by classifying Gaussians based on their relevance to the editing prompts derived from cross-attention layers of the T2I model. Furthermore, we present an innovative editing optimization method based on 3D Gaussian Splatting, obtaining stable and refined editing results through the guidance of Score Distillation Sampling and pseudo ground truth. We demonstrate the efficacy of our method through extensive experiments.
LGSur-Net: A Local Gaussian Surface Representation Network for Upsampling Highly Sparse Point Cloud
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Xiao, Zijian; Zhou, Tianchen; Yao, Li; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
We introduce LGSur-Net, an end-to-end deep learning architecture, engineered for the upsampling of sparse point clouds. LGSur-Net harnesses a trainable Gaussian local representation by positioning a series of Gaussian functions on an oriented plane, complemented by the optimization of individual covariance matrices. The integration of parametric factors allows for the encoding of the plane's rotational dynamics and Gaussian weightings into a linear transformation matrix. Then we extract the feature maps from the point cloud and its adjoining edges and learn the local Gaussian depictions to accurately model the shape's local geometry through an attention-based network. The Gaussian representation's inherent high-order continuity endows LGSur-Net with the natural ability to predict surface normals and support upsampling to any specified resolution. Comprehensive experiments validate that LGSur-Net efficiently learns from sparse data inputs, surpassing the performance of existing state-of-the-art upsampling methods. Our code is publicly available at https://github.com/Rangiant5b72/LGSur-Net.
LightUrban: Similarity Based Fine-grained Instancing for Lightweighting Complex Urban Point Clouds
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Lu, Zi Ang; Xiong, Wei Dan; Ren, Peng; Jia, Jin Yuan; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Large-scale urban point clouds play a vital role in various applications, while rendering and transmitting such data remains challenging due to its large volume, complicated structures, and significant redundancy. In this paper, we present LightUrban, the first point cloud instancing framework for efficient rendering and transmission of fine-grained complex urban scenes. We first introduce a segmentation method to organize the point clouds into individual buildings and vegetation instances from coarse to fine. Next, we propose an unsupervised similarity detection approach to accurately group instances with similar shapes. Furthermore, a fast pose and size estimation component is applied to calculate the transformations between the representative instance and the corresponding similar instances in each group. By replacing individual instances with their group's representative instances, the data volume and redundancy can be dramatically reduced. Experimental results on large-scale urban scenes demonstrate the effectiveness of our algorithm. To sum up, our method not only structures the urban point clouds but also significantly reduces data volume and redundancy, filling the gap in lightweighting urban landscapes through instancing.
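The instancing idea in this abstract can be sketched roughly as follows: store one representative point set per group, plus a small transform per instance. The abstract does not specify how pose and size are parameterized, so the `Transform` fields (uniform scale, rotation about the vertical axis, translation) and the helper names are hypothetical choices made only for this illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Transform:
    # Hypothetical parameterization: uniform scale, yaw about z, translation.
    scale: float
    yaw: float
    tx: float
    ty: float
    tz: float

def apply(t, p):
    """Map one representative point p into an instance's pose."""
    x, y, z = p
    c, s = math.cos(t.yaw), math.sin(t.yaw)
    return (t.scale * (c * x - s * y) + t.tx,
            t.scale * (s * x + c * y) + t.ty,
            t.scale * z + t.tz)

def expand(representative, transforms):
    """Rebuild every instance from the shared representative points."""
    return [[apply(t, p) for p in representative] for t in transforms]

def stored_floats(n_points, n_instances):
    """Compare raw storage with instanced storage, counted in floats."""
    raw = 3 * n_points * n_instances
    instanced = 3 * n_points + 5 * n_instances
    return raw, instanced
```

Under these assumptions, a group of 50 similar instances with 1000 points each needs 150000 floats when stored raw, but 3250 floats when stored as one representative plus 50 transforms.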
PCLC-Net: Point Cloud Completion in Arbitrary Poses using Learnable Canonical Space
(The Eurographics Association and John Wiley & Sons Ltd., 2024) Xu, Hanmo; Shuai, Qingyao; Chen, Xuejin; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
Recovering the complete structure from partial point clouds in arbitrary poses is challenging. Recently, many efforts have been made to address this problem by developing SO(3)-equivariant completion networks or aligning the partial point clouds with a predefined canonical space before completion. However, these approaches are limited to random rotations only or demand costly pose annotation for model training. In this paper, we present a novel Network for Point cloud Completion with Learnable Canonical space (PCLC-Net) to reduce the need for pose annotations and extract SE(3)-invariant geometry features to improve the completion quality in arbitrary poses. Without pose annotations, our PCLC-Net utilizes self-supervised pose estimation to align the input partial point clouds to a canonical space that is learnable for an object category and subsequently performs shape completion in the learned canonical space. Our PCLC-Net can complete partial point clouds with arbitrary SE(3) poses without requiring pose annotations for supervision. Our PCLC-Net achieves state-of-the-art results on shape completion with arbitrary SE(3) poses on both synthetic and real scanned data. To the best of our knowledge, our method is the first to achieve shape completion in arbitrary poses without pose annotations during network training.
