Open Vocabulary Object Detection
Open-vocabulary object detection (OVOD) aims to let computer vision systems localize and identify objects from free-form textual descriptions, including object categories that were not seen during training. Current research focuses on improving the accuracy and efficiency of OVOD, often by combining vision-language models (such as CLIP) with transformer-based detector architectures (such as DETR) to align visual and textual representations. Open challenges include fine-grained attribute recognition and robustness to distribution shifts. Advances in OVOD matter for applications such as autonomous driving, robotics, and remote sensing, where a detector must recognize categories beyond a fixed, predefined label set.
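To make the idea of aligning visual and textual representations concrete, here is a minimal sketch of the scoring step that many CLIP-style open-vocabulary detectors share: each candidate region is assigned the label whose text embedding is closest in cosine similarity. It is an illustration under stated assumptions, not the method of any particular paper. The function `classify_regions`, the `temperature` value, and the toy 3-dimensional embeddings are all hypothetical; a real system would obtain region embeddings from a detector and text embeddings from a vision-language model's text encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_regions(region_embeddings, text_embeddings, labels, temperature=0.01):
    """Assign each region the label with the most similar text embedding.

    The label list can be changed at inference time without retraining,
    which is what makes the vocabulary "open". The temperature is an
    assumed illustrative value, not one taken from a specific model.
    """
    text_unit = [l2_normalize(t) for t in text_embeddings]
    results = []
    for region in region_embeddings:
        region_unit = l2_normalize(region)
        # Cosine similarity between this region and every label prompt.
        sims = [sum(a * b for a, b in zip(region_unit, t)) for t in text_unit]
        probs = softmax([s / temperature for s in sims])
        best = max(range(len(labels)), key=lambda i: probs[i])
        results.append((labels[best], probs[best]))
    return results


# Toy embeddings standing in for real encoder outputs (assumption).
labels = ["cat", "traffic cone", "zebra"]
text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
region_embeddings = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.95]]

predictions = classify_regions(region_embeddings, text_embeddings, labels)
for label, confidence in predictions:
    print(label, round(confidence, 3))
```

Because the labels enter only through their text embeddings, adding a new category in this sketch means appending one more label and its embedding; no detector weights change.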