Image Understanding
Image understanding research aims to enable computers to interpret and reason about the content of images, approximating human visual perception and comprehension. Current work focuses on improving the accuracy and robustness of large multimodal models, such as multimodal large language models (LLMs) and vision-language models (VLMs). Key challenges include occlusion, cross-domain generalization, and hallucination, which are often addressed with techniques such as contrastive learning, retrieval augmentation, and self-training. These advances matter for applications ranging from medical image analysis and remote sensing to e-commerce and web accessibility, and they drive progress in both fundamental computer vision and practical AI systems.