Multimodal Search
Multimodal search aims to improve information retrieval by letting users query databases with multiple input modalities, such as text and images, mirroring how people naturally look for information. Current research focuses on large multimodal models (LMMs) and retrieval-augmented generation (RAG) techniques, often combined with specialized agents for tasks such as query understanding and result summarization, to improve search accuracy and personalization. The field matters because it could make information access more intuitive and effective across diverse domains, from e-commerce and cultural heritage management to video and image retrieval.
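To make the idea of querying with text and an image together more concrete, here is a minimal sketch of one common approach: embed each modality, fuse the two query vectors with a weighted average, and rank items by cosine similarity. This is only an illustration under stated assumptions, not the method of any particular paper. The function names (`fuse_query`, `search`), the `text_weight` parameter, and the 3-dimensional vectors are all hypothetical; in a real system the vectors would come from a trained multimodal encoder.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def fuse_query(text_vec, image_vec, text_weight=0.5):
    """Combine a text embedding and an image embedding into one query vector.

    Assumption: both embeddings live in the same vector space, as they would
    with a jointly trained text/image encoder.
    """
    t = normalize(text_vec)
    i = normalize(image_vec)
    fused = [text_weight * a + (1.0 - text_weight) * b for a, b in zip(t, i)]
    return normalize(fused)


def search(query_vec, index, top_k=2):
    """Rank indexed items by cosine similarity to the query vector."""
    q = normalize(query_vec)
    scored = [
        (item_id, sum(a * b for a, b in zip(q, normalize(vec))))
        for item_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy, hand-written embeddings standing in for encoder outputs (hypothetical).
INDEX = {
    "red_sneaker": [0.9, 0.1, 0.0],
    "blue_sneaker": [0.1, 0.9, 0.0],
    "red_dress": [0.8, 0.0, 0.6],
}

text_query = [0.0, 0.0, 1.0]   # stands in for the text "dress"
image_query = [1.0, 0.0, 0.0]  # stands in for a photo of a red item

results = search(fuse_query(text_query, image_query), INDEX)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

With these toy vectors, the item that matches both the text and the image cue ranks above the item that matches only the image. Changing `text_weight` shifts how much each modality influences the ranking.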