Open Vocabulary Object Detection
Open-vocabulary object detection (OVOD) aims to let computer vision systems detect objects named by textual descriptions, including objects that were not seen during training. Current research focuses on improving the accuracy and efficiency of OVOD, often by combining vision-language models (such as CLIP) with transformer-based architectures (such as DETR) to bridge the gap between visual and textual representations. Open challenges include fine-grained attribute recognition and robustness to distribution shifts. Advances in OVOD have significant implications for applications such as autonomous driving, robotics, and remote sensing, where they enable more flexible and adaptable object recognition.
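To make the idea of bridging visual and textual representations concrete, here is a minimal sketch of the matching step many CLIP-style detectors rely on: each candidate region embedding is compared with the text embedding of every candidate label, and the region takes the best-scoring label. It is an illustration under stated assumptions, not the method of any particular paper. The function names (`l2_normalize`, `classify_regions`), the temperature value, and the three-dimensional toy embeddings are all hypothetical; in a real system the embeddings would come from a vision-language model's image and text encoders.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def classify_regions(region_embeddings, text_embeddings, labels, temperature=0.01):
    """Assign each region the label whose text embedding is most similar.

    region_embeddings: (num_regions, dim) array, assumed to come from an image encoder.
    text_embeddings:   (num_labels, dim) array, assumed to come from a text encoder.
    labels:            list of num_labels strings; can be changed at inference time.
    temperature:       hypothetical softmax temperature, chosen only for illustration.
    """
    regions = l2_normalize(np.asarray(region_embeddings, dtype=np.float64))
    texts = l2_normalize(np.asarray(text_embeddings, dtype=np.float64))

    # Cosine similarity between every region and every label, scaled by temperature.
    logits = regions @ texts.T / temperature

    # Numerically stable softmax over the label axis.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)

    best = probs.argmax(axis=1)
    return [(labels[i], float(probs[r, i])) for r, i in enumerate(best)]


# Toy data standing in for encoder outputs (made up for this example).
labels = ["zebra", "traffic cone", "forklift"]
text_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
region_embeddings = np.array([
    [0.9, 0.1, 0.0],  # points mostly along the "zebra" direction
    [0.1, 0.2, 0.8],  # points mostly along the "forklift" direction
])

predictions = classify_regions(region_embeddings, text_embeddings, labels)
for region_index, (label, score) in enumerate(predictions):
    print(f"region {region_index}: {label} ({score:.3f})")
```

Because the label set enters only through `text_embeddings`, a new category can be added by encoding a new piece of text, with no retraining of the detector. That is the sense in which the vocabulary is open.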