Open Vocabulary Semantic Segmentation
Open-vocabulary semantic segmentation (OVSS) assigns a semantic label to every pixel of an image, where the candidate labels are arbitrary text categories supplied at inference time rather than a fixed, pre-defined set. This allows a model to segment categories that were not seen during training. Current research focuses on adapting vision-language models such as CLIP, often in combination with other foundation models (e.g., SAM, DINO), and uses techniques such as multi-resolution processing, pseudo-mask generation, and contrastive learning to improve accuracy and efficiency. By enabling more flexible and robust image understanding, OVSS is promising for applications including autonomous driving, remote sensing, and medical image analysis.
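The core labeling step shared by many CLIP-style approaches can be illustrated with a minimal sketch: each pixel embedding is compared to the text embedding of every candidate class name, and the pixel takes the label with the highest cosine similarity. Everything below is an assumption made for illustration, not any specific paper's method: the function name `segment_open_vocab` is hypothetical, and the embeddings are synthetic stand-ins for the outputs of a real image encoder and text encoder.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def segment_open_vocab(pixel_embeddings, text_embeddings):
    """Label each pixel with the index of its most similar text embedding.

    pixel_embeddings: array of shape (H, W, D), one embedding per pixel.
    text_embeddings:  array of shape (K, D), one embedding per class name.
    Returns an integer label map of shape (H, W).
    """
    p = l2_normalize(pixel_embeddings)
    t = l2_normalize(text_embeddings)
    similarity = p @ t.T              # (H, W, K) cosine similarities
    return similarity.argmax(axis=-1)

# Toy data (assumption): orthogonal vectors stand in for text embeddings
# that a model such as CLIP would produce for each class name.
class_names = ["road", "car", "tree"]
text_emb = np.eye(len(class_names), 16)

# Synthetic pixel embeddings: the embedding of the true class plus small noise.
rng = np.random.default_rng(0)
true_labels = rng.integers(0, len(class_names), size=(4, 5))
pixel_emb = text_emb[true_labels] + 0.01 * rng.normal(size=(4, 5, 16))

pred = segment_open_vocab(pixel_emb, text_emb)
```

Because the class list is only an input to the text side, changing `class_names` (and re-embedding them) changes the label set without retraining, which is what makes the vocabulary open. Real systems add the components mentioned above, such as mask proposals or multi-resolution features, on top of this matching step.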
Papers
December 31, 2024
December 20, 2024
December 18, 2024
December 12, 2024
November 26, 2024
November 24, 2024
November 22, 2024
November 21, 2024
November 18, 2024
November 15, 2024
October 15, 2024
October 9, 2024
October 2, 2024
September 30, 2024
September 21, 2024
August 27, 2024
August 18, 2024
August 9, 2024
August 1, 2024