Image Captioning
Image captioning is the task of automatically generating descriptive text for images, and it sits at the intersection of computer vision and natural language processing. Current research focuses on three areas: improving efficiency (e.g., through early exits and knowledge distillation), improving performance on fine-grained datasets (e.g., by incorporating object-part details), and developing more robust evaluation metrics (e.g., ones that account for hallucinations). These advances matter for applications ranging from assisting visually impaired people to improving image search and retrieval, and they are driving progress in both vision-language models and evaluation methodology.
Papers
November 18, 2024
November 9, 2024
October 26, 2024
October 22, 2024
October 8, 2024
October 6, 2024
September 30, 2024
September 28, 2024
September 26, 2024
September 23, 2024
September 19, 2024
September 17, 2024
September 5, 2024
August 29, 2024
August 28, 2024
August 26, 2024
August 25, 2024
August 12, 2024