Multimodal Retrieval
Multimodal retrieval is the task of searching and retrieving information across data types such as text, images, and video, with the goal of returning accurate, relevant results. Current research emphasizes universal embedding models, often built on transformer architectures and trained with contrastive learning, that handle many combinations of modalities and tasks. Other work improves efficiency through generative indexing and refines retrieved results with large language models (LLMs). The field matters for information access in many domains, including search engines, embodied AI agents, medical diagnosis, and misinformation detection.
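As a rough illustration of the embedding-based approach described above, the following minimal sketch ranks items of different modalities by cosine similarity in one shared vector space and computes a one-directional InfoNCE-style contrastive loss over matched text-image pairs. Everything here is an assumption made for illustration: the vectors are hand-made stand-ins for the output of a trained encoder, and the names `retrieve`, `info_nce`, the item ids, and the temperature value are hypothetical rather than taken from any particular paper or library.

```python
import math


def normalize(v):
    """Scale a vector to unit length (assumes v is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def retrieve(query_vec, index, k=2):
    """Return the ids of the k items most similar to the query.

    `index` maps an item id (of any modality) to its embedding in the
    shared space, so one query can rank images, video, and text together.
    """
    scored = [(cosine(query_vec, vec), item_id) for item_id, vec in index.items()]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]


def info_nce(text_vecs, image_vecs, temperature=0.07):
    """Text-to-image InfoNCE loss over a batch of matched pairs.

    Pair i (text_vecs[i], image_vecs[i]) is the positive; every other
    image in the batch acts as a negative for that text.
    """
    losses = []
    for i, text in enumerate(text_vecs):
        logits = [cosine(text, image) / temperature for image in image_vecs]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical); a real system would get these from an encoder.
index = {
    "image:cat.jpg": [0.9, 0.1, 0.0],
    "video:dog.mp4": [0.1, 0.9, 0.1],
    "text:tax-form": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]  # stand-in for the embedded text query "a photo of a cat"
top = retrieve(query, index, k=2)

# Correctly aligned pairs should give a much lower loss than shuffled pairs.
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this sketch the loss is small when each text embedding sits closest to its own image embedding and large when the pairs are shuffled, which is the training signal that pulls matching items from different modalities together. Systems of this kind typically replace the linear scan in `retrieve` with an approximate nearest-neighbor index once the collection grows.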
Papers
August 25, 2024
August 21, 2024
August 16, 2024
July 19, 2024
July 17, 2024
July 1, 2024
June 11, 2024
June 6, 2024
May 31, 2024
May 30, 2024
May 8, 2024
May 4, 2024
April 19, 2024
March 8, 2024
February 27, 2024
January 11, 2024
December 15, 2023
November 28, 2023
November 14, 2023