3D Semantic Understanding

3D semantic understanding aims to build comprehensive, computer-interpretable models of 3D scenes that capture both geometry and semantic labels for objects and surfaces. Current research centers on deep learning, particularly neural radiance fields (NeRFs) and Gaussian splatting, often combined with vision-language and segmentation foundation models such as CLIP and SAM, to achieve real-time, view-consistent 3D scene reconstruction and semantic segmentation from varied inputs, including single images and panoramic views. The field is crucial for robotics, autonomous driving, augmented reality, and Earth observation, where it enables more robust and intelligent interaction with the 3D world.
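To make the combination of 3D representations and language-aligned features more concrete, the sketch below illustrates one common pattern from this line of work: each reconstructed 3D primitive (a Gaussian or point) carries a feature vector distilled from a model like CLIP, and a text prompt embedded into the same space is matched against those features to produce open-vocabulary labels. This is a minimal, hypothetical illustration, not the method of any specific paper; the function names, random placeholder embeddings, and class prompts are assumptions, and a real pipeline would obtain the per-primitive features from the reconstruction and the prompt embeddings from a text encoder.

```python
import numpy as np

def cosine_similarity(features, query):
    """Cosine similarity between each row of `features` (N, D) and `query` (D,)."""
    f = features / np.linalg.norm(features, axis=-1, keepdims=True)
    q = query / np.linalg.norm(query)
    return f @ q

def query_semantic_field(primitive_features, text_embeddings, labels):
    """Assign each 3D primitive the label whose prompt embedding it matches best.

    primitive_features: (N, D) language-aligned features, one per Gaussian/point.
    text_embeddings:    (C, D) embeddings of the candidate class prompts.
    labels:             list of C class names.
    """
    # Similarity of every primitive to every class prompt: shape (N, C).
    sims = np.stack(
        [cosine_similarity(primitive_features, t) for t in text_embeddings],
        axis=-1,
    )
    best = sims.argmax(axis=-1)  # index of the best-matching prompt per primitive
    return [labels[i] for i in best], sims

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = 512  # CLIP-like embedding dimension (assumption)
    # Placeholder data standing in for distilled per-Gaussian features
    # and encoded text prompts.
    gaussian_features = rng.normal(size=(1000, D))
    prompt_embeddings = rng.normal(size=(3, D))
    class_names = ["chair", "table", "floor"]

    assigned, scores = query_semantic_field(gaussian_features, prompt_embeddings, class_names)
    print(assigned[:5], scores.shape)
```

The key design point this illustrates is that semantics live in a shared embedding space: once every 3D primitive has a language-aligned feature, segmentation reduces to nearest-prompt matching, which is what makes open-vocabulary queries over a reconstructed scene possible.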

Papers