Object Recognition
Object recognition, a core task in computer vision, aims to automatically identify and locate objects within images or videos. Current research emphasizes improving accuracy and efficiency under challenging conditions, such as low light, occlusion, and previously unseen object categories, often leveraging vision-language models (VLMs), convolutional neural networks (CNNs), and transformer architectures. The field is crucial for advancing robotics, autonomous systems, assistive technologies for visually impaired individuals, and other applications that require robust scene understanding. Ongoing efforts focus on mitigating annotation errors, enhancing model explainability, and developing more efficient algorithms for real-time performance.
Papers
On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey
Jingcai Guo, Zhijie Rao, Zhi Chen, Song Guo, Jingren Zhou, Dacheng Tao
UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling
Haider Al-Tahan, Quentin Garrido, Randall Balestriero, Diane Bouchacourt, Caner Hazirbas, Mark Ibrahim