Multimodal Retrieval
Multimodal retrieval is the task of searching and retrieving information across data types such as text, images, and video, with the goal of returning more accurate and relevant results. Current research emphasizes universal embedding models, often built on transformer architectures and trained with contrastive learning, that handle many combinations of modalities and tasks. Related work improves efficiency through generative indexing and refines retrieved results with large language models (LLMs). The field matters for information access in many domains, from search engines and embodied AI agents to medical diagnosis and misinformation detection.
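To make the shared-embedding idea concrete, here is a minimal sketch of cross-modal retrieval by cosine similarity. It assumes that some encoder has already mapped text, images, and video into one vector space. The three-dimensional vectors, the item names, and the `retrieve` helper are hypothetical and chosen only for illustration; a real system would use learned embeddings and an approximate nearest-neighbor index.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def retrieve(query_vec, index, k=2):
    """Return the k items most similar to the query, regardless of modality."""
    scored = [
        (item_id, modality, cosine_similarity(query_vec, vec))
        for item_id, modality, vec in index
    ]
    scored.sort(key=lambda entry: entry[2], reverse=True)
    return scored[:k]


# Hypothetical toy index: (item id, modality, embedding in the shared space).
# The vectors are hand-picked for illustration, not produced by a real model.
TOY_INDEX = [
    ("caption_dog", "text", [0.9, 0.1, 0.0]),
    ("photo_dog", "image", [0.8, 0.2, 0.1]),
    ("clip_traffic", "video", [0.0, 0.1, 0.9]),
    ("photo_car", "image", [0.1, 0.0, 0.8]),
]

# A hypothetical query embedding that lies near the "dog" items.
query = [1.0, 0.0, 0.0]
top = retrieve(query, TOY_INDEX, k=2)

for item_id, modality, score in top:
    print(f"{item_id} ({modality}): {score:.3f}")
```

Because every item lives in the same space, a single query can rank a text caption and an image side by side. That is the property contrastively trained universal embedding models aim to provide.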