Image Understanding
Image understanding research aims to enable computers to interpret and reason about the content of images, approximating human visual perception and comprehension. Current efforts focus on improving the accuracy and robustness of large multimodal models, such as vision-language models (VLMs) and multimodal large language models (LLMs), particularly on challenges such as occlusion, cross-domain generalization, and hallucination. Common techniques include contrastive learning, retrieval augmentation, and self-training. These advances matter for applications ranging from medical image analysis and remote sensing to e-commerce and web accessibility, and they drive progress in both fundamental computer vision and practical AI systems.