Text Grounding
Text grounding aligns textual descriptions with the visual information they refer to, so that multimodal data can be understood and interpreted more reliably. Current research aims to make this alignment more accurate and efficient, using techniques such as fine-grained image-text alignment, multimodal large language models (MLLMs), and contrastive learning. The work matters for visual question answering, image captioning, and information retrieval, where better grounding supports more robust and explainable AI systems.
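As a rough illustration of the contrastive learning idea mentioned above, the sketch below computes a symmetric InfoNCE-style loss over a small batch of matched image and text embeddings. It is a minimal, standard-library-only example and not the method of any particular paper: the function names, the toy two-dimensional embeddings, and the temperature value of 0.07 are all assumptions chosen for illustration.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    # Numerically stable log-softmax over one row of similarity scores.
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss; pair i of images and texts is assumed to match."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # sim[i][j]: cosine similarity of image i and text j, divided by the temperature.
    sim = [[sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature
            for j in range(n)] for i in range(n)]
    # Image-to-text direction: the correct text for image i is on the diagonal.
    i2t = -sum(log_softmax(sim[i])[i] for i in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix.
    sim_t = [list(col) for col in zip(*sim)]
    t2i = -sum(log_softmax(sim_t[j])[j] for j in range(n)) / n
    return (i2t + t2i) / 2

# Toy embeddings (hypothetical): texts roughly parallel to their images,
# then the same texts in swapped order.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.1], [0.1, 1.0]])
shuffled = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.1, 1.0], [1.0, 0.1]])
```

With these toy inputs the loss for correctly paired embeddings comes out far lower than for the swapped pairing, which is the signal a model trained this way would use to pull matching image and text representations together.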