Open Vocabulary 3D

Open-vocabulary 3D object detection (OV-3DDet) aims to localize and recognize 3D objects, including categories not seen during training, from inputs such as RGB-D images and point clouds. Current research leverages pre-trained vision-language models and multi-modal learning, often combining cross-modal alignment with novel object discovery to overcome data scarcity. These advances matter for robotics, autonomous navigation, and augmented reality, where robust and adaptable object recognition is crucial.

Papers