Multimodal Retrieval
Multimodal retrieval is the task of searching and retrieving information across data types such as text, images, and video, with the goal of returning more accurate and relevant results. Current research emphasizes universal embedding models, often built on transformer architectures and trained with contrastive learning, that can handle many combinations of modalities and tasks. Related work improves efficiency through generative indexing and refines retrieved results with large language models (LLMs). The field matters for information access in many domains, from search engines and embodied AI agents to medical diagnosis and misinformation detection.
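As a rough illustration of the shared-embedding idea, the sketch below ranks items of different modalities by cosine similarity to a query vector and computes a one-directional InfoNCE-style contrastive loss over a batch of paired embeddings. It is a minimal toy under stated assumptions, not any particular paper's method: the vectors are hand-written stand-ins for encoder outputs, and the names `rank`, `info_nce`, the file names, and the temperature value 0.07 are all hypothetical choices made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, index):
    """Return item ids sorted by descending similarity to the query."""
    scored = [(cosine(query_vec, vec), item_id) for item_id, vec in index.items()]
    return [item_id for _, item_id in sorted(scored, reverse=True)]

def info_nce(text_vecs, image_vecs, temperature=0.07):
    """Text-to-image contrastive loss: pair i is the positive,
    every other image in the batch acts as a negative."""
    total = 0.0
    for i, text_vec in enumerate(text_vecs):
        logits = [cosine(text_vec, img) / temperature for img in image_vecs]
        peak = max(logits)  # subtract the max for numerical stability
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(text_vecs)

# Hypothetical embeddings: in practice these would come from trained encoders.
index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.mp4": [0.1, 0.9, 0.0],
    "car.txt": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # stand-in for an embedded text query
ranking = rank(query, index)

# Matched pairs should give a lower loss than mismatched pairs.
texts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned_loss = info_nce(texts, texts)
shuffled_loss = info_nce(texts, list(reversed(texts)))
```

Because every modality maps into the same vector space, the ranking step does not need to know whether an indexed item is an image, a video, or a document. The loss is what pushes matched pairs together and mismatched pairs apart during training.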