Open-Vocabulary Detection Benchmark
Open-vocabulary object detection (OVOD) benchmarks evaluate how well computer vision models can localize and identify objects from textual descriptions, including classes never seen during training. Current research focuses on improving the accuracy and efficiency of OVOD models, typically by pairing detectors with vision-language models and applying techniques such as contrastive learning, pseudo-labeling, and language-aware fusion to improve performance on challenging datasets like LVIS and COCO. These benchmarks are central to progress in open-vocabulary detection, driving the development of more robust and generalizable visual recognition systems with applications across many domains.
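At the core of most vision-language OVOD approaches is a simple matching step: region features produced by the detector are compared against text embeddings of class names, and each region is assigned the class whose embedding it is most similar to. The sketch below illustrates that step with cosine similarity over toy embeddings; the function name, shapes, and data are illustrative assumptions, not any specific model's API.

```python
import numpy as np

def classify_regions(region_feats, text_feats, class_names):
    """Assign each region the class name whose text embedding
    is most cosine-similar to the region's visual embedding.
    (Illustrative sketch, not a specific OVOD model's API.)"""
    # L2-normalize so dot products equal cosine similarity
    r = region_feats / np.linalg.norm(region_feats, axis=1, keepdims=True)
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    sims = r @ t.T  # (num_regions, num_classes) similarity matrix
    best = sims.argmax(axis=1)
    return [class_names[i] for i in best], sims

# Toy example: 3 class-name embeddings, 2 region features (4-dim)
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(3, 4))
# Construct regions that align with "cat" and "zebra" respectively
region_feats = np.stack([text_feats[0] * 2.0, text_feats[2] * 0.5])
labels, sims = classify_regions(region_feats, text_feats,
                                ["cat", "dog", "zebra"])
print(labels)  # → ['cat', 'zebra']
```

Because the class set is just a list of strings, new categories can be added at inference time by embedding their names, which is what makes the vocabulary "open".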