Open Vocabulary
Open vocabulary research aims to enable artificial intelligence systems to understand and interact with the world using free-form text descriptions, going beyond predefined categories. Current efforts focus on adapting large language and vision-language models (like CLIP and LLMs) to various tasks, including 3D scene understanding, object detection and tracking, and robotic manipulation, often employing architectures such as DETR and transformers. This work is significant because it pushes the boundaries of AI's ability to generalize to unseen objects and situations, with potential impact on autonomous driving, robotics, and other fields requiring robust real-world interaction.
Papers
Prompt-Guided Mask Proposal for Two-Stage Open-Vocabulary Segmentation
Yu-Jhe Li, Xinyang Zhang, Kun Wan, Lantao Yu, Ajinkya Kale, Xin Lu
SuperGSeg: Open-Vocabulary 3D Segmentation with Structured Super-Gaussians
Siyun Liang, Sen Wang, Kunyi Li, Michael Niemeyer, Stefano Gasperini, Nassir Navab, Federico Tombari