Open Vocabulary Panoptic Segmentation
Open-vocabulary panoptic segmentation aims to segment an image into both amorphous semantic regions ("stuff") and individual object instances ("things"), labeling each segment with a category given as a text description, including categories unseen during training. Current research focuses on improving mask classification accuracy through techniques such as multimodal attention mechanisms and vision-language model fine-tuning, often building on pre-trained models such as CLIP and SAM. This rapidly advancing field is important for robust scene understanding in robotics, autonomous driving, and other applications that must interpret images beyond a predefined set of object categories.
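To make the mask classification step concrete, here is a minimal sketch of one common pattern: pool image features inside each proposed mask, then compare the pooled embedding against text embeddings of the candidate category names by cosine similarity. It is an illustration under stated assumptions, not the method of any particular paper. The function names `mask_pool` and `classify_masks`, the temperature value, and the toy arrays are all hypothetical; in a real system the features and text embeddings would come from a vision-language model such as CLIP and the masks from a mask proposal network.

```python
import numpy as np

def mask_pool(features, masks):
    """Average per-pixel features inside each binary mask.

    features: (H, W, D) array of per-pixel embeddings (assumed given).
    masks:    (N, H, W) array of 0/1 mask proposals (assumed given).
    Returns an (N, D) array with one embedding per mask.
    """
    masks = masks.astype(features.dtype)
    # Guard against empty masks so the division below is always defined.
    area = np.maximum(masks.sum(axis=(1, 2)), 1.0)
    pooled = np.einsum("nhw,hwd->nd", masks, features)
    return pooled / area[:, None]

def classify_masks(mask_embeds, text_embeds, temperature=0.01):
    """Assign each mask the category whose text embedding is most similar.

    temperature is an illustrative value, not a recommended setting.
    Returns (labels, probs) with shapes (N,) and (N, C).
    """
    eps = 1e-8
    m = mask_embeds / (np.linalg.norm(mask_embeds, axis=1, keepdims=True) + eps)
    t = text_embeds / (np.linalg.norm(text_embeds, axis=1, keepdims=True) + eps)
    logits = (m @ t.T) / temperature            # scaled cosine similarity
    logits -= logits.max(axis=1, keepdims=True)  # numerically stable softmax
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs

# Toy data standing in for real model outputs (purely synthetic).
class_names = ["sky", "road", "zebra"]   # the vocabulary is free text at test time
text_embeds = np.eye(3)                  # one made-up embedding per class name

features = np.zeros((2, 2, 3))
features[0, :, :] = [1.0, 0.0, 0.0]      # top row resembles "sky"
features[1, :, :] = [0.0, 1.0, 0.0]      # bottom row resembles "road"

masks = np.zeros((2, 2, 2))
masks[0, 0, :] = 1                       # mask 0 covers the top row
masks[1, 1, :] = 1                       # mask 1 covers the bottom row

labels, probs = classify_masks(mask_pool(features, masks), text_embeds)
predicted = [class_names[i] for i in labels]
```

Because the category list is just a set of text embeddings, changing the vocabulary only means recomputing `text_embeds`; the pooling and similarity steps stay the same, which is what allows categories unseen during training to be named at inference time.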
Papers
September 24, 2024
July 15, 2024
April 2, 2024
March 14, 2024
January 4, 2024
September 11, 2023
September 8, 2023
March 20, 2023
March 8, 2023