Open-Vocabulary 3D Instance Segmentation

Open-vocabulary 3D instance segmentation aims to identify and segment objects within 3D scenes using free-form text descriptions, going beyond pre-defined object categories. Current research focuses on models that integrate 2D and 3D information, often leveraging pre-trained vision-language models together with techniques such as multi-view fusion, mask graph clustering, and dual-path integration to produce accurate segmentations efficiently. This capability matters for robotics, augmented reality, and other applications that require flexible and robust 3D scene understanding, particularly in scenes containing unseen or novel objects.
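
As a rough illustration of how multi-view fusion and a vision-language model can combine to label 3D instances from free-form text, the sketch below assumes that each 3D instance mask already has one image-embedding per camera view and that each text query already has a text-embedding in the same space (for example, from a CLIP-style model). The function names, array shapes, and the visibility-weighted averaging are illustrative assumptions, not the method of any particular paper.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def fuse_multiview_features(view_features, visibility):
    """Fuse per-view embeddings into one embedding per 3D instance.

    view_features: (num_instances, num_views, dim) embeddings of each instance's 2D crop.
    visibility:    (num_instances, num_views) non-negative weights, e.g. the fraction
                   of the instance visible in each view (assumed to be given).
    """
    totals = np.clip(visibility.sum(axis=1, keepdims=True), 1e-8, None)
    weights = visibility / totals                      # each row sums to 1
    fused = (view_features * weights[..., None]).sum(axis=1)
    return l2_normalize(fused)

def query_instances(instance_features, text_features):
    """Assign each instance the text query with the highest cosine similarity."""
    sims = instance_features @ l2_normalize(text_features).T
    return sims.argmax(axis=1), sims

# Toy data: 2 instances, 3 views, 4-dimensional embeddings (made up for illustration).
view_features = np.array([
    [[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
    [[0.0, 1.0, 0.0, 0.0], [0.0, 0.8, 0.2, 0.0], [0.0, 1.0, 0.0, 0.0]],
])
visibility = np.array([
    [1.0, 1.0, 0.0],   # instance 0 is occluded in the third view
    [1.0, 0.5, 1.0],
])
text_features = np.array([
    [1.0, 0.0, 0.0, 0.0],   # stand-in embedding for a first text query
    [0.0, 1.0, 0.0, 0.0],   # stand-in embedding for a second text query
])

fused = fuse_multiview_features(view_features, visibility)
labels, sims = query_instances(fused, text_features)
```

Weighting by visibility keeps an occluded or off-target view from dominating the fused embedding; a real pipeline would obtain the embeddings from a pre-trained vision-language model rather than from hand-written vectors.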

Papers