Factored Neural Representation for Scene Understanding

dc.contributor.author: Wong, Yu-Shiang (en_US)
dc.contributor.author: Mitra, Niloy J. (en_US)
dc.contributor.editor: Memari, Pooran (en_US)
dc.contributor.editor: Solomon, Justin (en_US)
dc.date.accessioned: 2023-06-30T06:19:13Z
dc.date.available: 2023-06-30T06:19:13Z
dc.date.issued: 2023
dc.description.abstract: A long-standing goal in scene understanding is to obtain interpretable and editable representations that can be directly constructed from a raw monocular RGB-D video, without requiring specialized hardware setup or priors. The problem is significantly more challenging in the presence of multiple moving and/or deforming objects. Traditional methods have approached the setup with a mix of simplifications, scene priors, pretrained templates, or known deformation models. The advent of neural representations, especially neural implicit representations and radiance fields, opens the possibility of end-to-end optimization to collectively capture geometry, appearance, and object motion. However, current approaches produce global scene encoding, assume multiview capture with limited or no motion in the scenes, and do not facilitate easy manipulation beyond novel view synthesis. In this work, we introduce a factored neural scene representation that can directly be learned from a monocular RGB-D video to produce object-level neural presentations with an explicit encoding of object movement (e.g., rigid trajectory) and/or deformations (e.g., nonrigid movement). We evaluate ours against a set of neural approaches on both synthetic and real data to demonstrate that the representation is efficient, interpretable, and editable (e.g., change object trajectory). Code and data are available at: http://geometry.cs.ucl.ac.uk/projects/2023/factorednerf/. (en_US)
dc.description.number: 5
dc.description.sectionheaders: Point Clouds and Scenes
dc.description.seriesinformation: Computer Graphics Forum
dc.description.volume: 42
dc.identifier.doi: 10.1111/cgf.14911
dc.identifier.issn: 1467-8659
dc.identifier.pages: 14 pages
dc.identifier.uri: https://doi.org/10.1111/cgf.14911
dc.identifier.uri: https://diglib.eg.org:443/handle/10.1111/cgf14911
dc.publisher: The Eurographics Association and John Wiley & Sons Ltd. (en_US)
dc.subject: CCS Concepts: Computing methodologies -> Reconstruction; Volumetric models; Tracking
dc.subject: Computing methodologies
dc.subject: Reconstruction
dc.subject: Volumetric models
dc.subject: Tracking
dc.title: Factored Neural Representation for Scene Understanding (en_US)
Files
Original bundle
Name: v42i5_15_14911.pdf
Size: 42.23 MB
Format: Adobe Portable Document Format