Image to Text
Image-to-text research aims to generate textual descriptions from images automatically, bridging visual and linguistic understanding. Current work concentrates on improving the accuracy and efficiency of transformer-based models, often using techniques such as visual grounding and hierarchical processing to better capture spatial relationships and semantic detail within images. The field is significant for advancing multimodal AI, with applications that include automated image captioning, document understanding, assistive technologies for visually impaired users, and improved accessibility of digital content more broadly.
Papers
September 25, 2023
September 8, 2023
August 10, 2023
August 8, 2023
July 13, 2023
July 11, 2023
July 7, 2023
June 29, 2023
June 13, 2023
June 7, 2023
May 4, 2023
April 28, 2023
March 21, 2023
March 10, 2023
December 23, 2022
October 20, 2022
October 19, 2022
October 7, 2022
January 14, 2022