Image to Text
Image-to-text research focuses on automatically generating textual descriptions from images, aiming to bridge the gap between visual and linguistic understanding. Current work concentrates on improving the accuracy and efficiency of transformer-based models, often through techniques such as visual grounding and hierarchical processing that better capture spatial relationships and semantic detail within images. The field is significant for advancing multimodal AI, with applications that include automated image captioning, document understanding, assistive technologies for visually impaired people, and broader digital accessibility.
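To make the transformer-based setup more concrete, the sketch below shows the first step that ViT-style vision encoders commonly take: cutting an image into fixed-size patches that become the input token sequence. It is only an illustration, not the method of any particular paper. The names `patchify` and `multi_scale_tokens` are hypothetical, the image is a plain nested list of grayscale values, and tokenizing at several patch sizes is just one simple reading of "hierarchical processing".

```python
from typing import Dict, List

Image = List[List[float]]  # grayscale image as rows of pixel values


def patchify(image: Image, patch: int) -> List[List[float]]:
    """Split an H x W image into non-overlapping patch x patch tiles.

    Each tile is flattened row by row, and tiles are returned in
    row-major order (left to right, then top to bottom).
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tile = [image[top + r][left + c]
                    for r in range(patch) for c in range(patch)]
            tokens.append(tile)
    return tokens


def multi_scale_tokens(image: Image, sizes: List[int]) -> Dict[int, List[List[float]]]:
    """Hypothetical helper: tokenize the same image at several patch sizes."""
    return {size: patchify(image, size) for size in sizes}


# A 4 x 4 toy image whose pixel values are 0.0 .. 15.0 in reading order.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
tokens = patchify(image, 2)               # four tokens of four values each
scales = multi_scale_tokens(image, [2, 4])  # fine and coarse views
```

In a real model each flattened patch would then be linearly projected, given a position embedding, and passed through transformer layers before a text decoder attends to the result; those stages are omitted here.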