Open-Vocabulary Semantic Segmentation
Open-vocabulary semantic segmentation (OVSS) aims to assign a semantic label to every image pixel without being restricted to a pre-defined set of categories, which enables the recognition of objects not seen during training. Current research focuses on adapting vision-language models such as CLIP, often combined with other foundation models (e.g., SAM, DINO), and employs techniques such as multi-resolution processing, pseudo-mask generation, and contrastive learning to improve accuracy and efficiency. OVSS holds significant promise for applications including autonomous driving, remote sensing, and medical image analysis, where it enables more flexible and robust image understanding.
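The core labeling step behind many CLIP-style approaches can be illustrated with a minimal sketch: each pixel embedding is compared with the text embedding of every candidate category name, and the pixel takes the best-matching label. Everything named here is an assumption made for illustration. The function `segment_open_vocab`, the 3-dimensional toy vectors, and the class names are hypothetical stand-ins for the outputs of real image and text encoders, and no actual CLIP, SAM, or DINO API is called.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def segment_open_vocab(pixel_embeddings, text_embeddings):
    """Label each pixel with the index of its most similar text embedding.

    pixel_embeddings: array of shape (H, W, D), one embedding per pixel
                      (hypothetical output of an image encoder).
    text_embeddings:  array of shape (C, D), one embedding per category name
                      (hypothetical output of a text encoder).
    Returns an integer label map of shape (H, W).
    """
    pixels = l2_normalize(pixel_embeddings)
    texts = l2_normalize(text_embeddings)
    similarity = pixels @ texts.T  # cosine similarities, shape (H, W, C)
    return similarity.argmax(axis=-1)


# The vocabulary is supplied at inference time, not fixed during training.
class_names = ["cat", "road", "zebra"]

# Toy stand-ins for text embeddings (assumed values, not real encoder output).
text_embeddings = np.array([
    [1.0, 0.0, 0.0],  # "cat"
    [0.0, 1.0, 0.0],  # "road"
    [0.0, 0.0, 1.0],  # "zebra"
])

# Toy 2x2 "image" of pixel embeddings in the same 3-dimensional space.
pixel_embeddings = np.array([
    [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1]],
    [[0.0, 0.2, 0.9], [0.7, 0.2, 0.1]],
])

label_map = segment_open_vocab(pixel_embeddings, text_embeddings)
named_map = [[class_names[i] for i in row] for row in label_map]
print(named_map)
```

Because the category list is just an input, changing `class_names` and the corresponding text embeddings changes the label set without retraining, which is the property that makes the approach open-vocabulary. Real systems differ mainly in how they obtain pixel-aligned embeddings and mask proposals.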
Papers
April 9, 2024
March 30, 2024
March 17, 2024
March 6, 2024
February 21, 2024
January 22, 2024
December 19, 2023
December 7, 2023
November 28, 2023
November 27, 2023
November 19, 2023
October 29, 2023
October 8, 2023
September 25, 2023
September 6, 2023
August 31, 2023
August 4, 2023
June 1, 2023
May 23, 2023