Open Vocabulary Object Detection
Open-vocabulary object detection (OVOD) aims to let computer vision systems localize and identify objects from textual descriptions, including categories that were not among the labeled classes seen during training. Current research focuses on improving the accuracy and efficiency of OVOD, often using vision-language models (like CLIP) and transformer-based architectures (like DETR) to bridge the gap between visual and textual representations. It also addresses open challenges such as fine-grained attribute recognition and robustness to distribution shifts. Advances in OVOD matter for applications including autonomous driving, robotics, and remote sensing, where the set of objects to recognize cannot be fixed in advance.
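One common way to bridge visual and textual representations is to embed candidate image regions and free-form class names into a shared space, then label each region with the nearest class name. The sketch below illustrates only that matching step, under loud assumptions: the embeddings are tiny made-up vectors standing in for the outputs of a vision-language model, and `classify_regions`, `l2_normalize`, and the `temperature` value are hypothetical names and settings chosen for illustration, not the API of any particular detector.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def classify_regions(region_embeds, text_embeds, class_names, temperature=0.01):
    """Assign each region the class name whose text embedding is most similar.

    region_embeds: (num_regions, dim) array, assumed to come from an image encoder.
    text_embeds:   (num_classes, dim) array, assumed to come from a text encoder.
    The vocabulary is just `class_names`, so it can be changed at inference time.
    """
    r = l2_normalize(np.asarray(region_embeds, dtype=np.float64))
    t = l2_normalize(np.asarray(text_embeds, dtype=np.float64))

    # Cosine similarity between every region and every class name.
    logits = (r @ t.T) / temperature

    # Numerically stable softmax over the class axis.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)

    best = probs.argmax(axis=1)
    return [(class_names[i], float(probs[n, i])) for n, i in enumerate(best)]


# Toy stand-ins for encoder outputs (illustrative values, not real model outputs).
class_names = ["zebra", "forklift", "traffic cone"]
text_embeds = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
region_embeds = np.array([
    [0.9, 0.1, 0.0],   # region that points mostly along the "zebra" direction
    [0.1, 0.2, 0.95],  # region that points mostly along the "traffic cone" direction
])

predictions = classify_regions(region_embeds, text_embeds, class_names)
for name, score in predictions:
    print(f"{name}: {score:.3f}")
```

Because the class list is an ordinary input, adding a new category in this sketch means adding one more text embedding rather than retraining a classifier head. A full detector would also need to propose and refine the regions themselves, which is outside what this example shows.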