Cross-Modal Retrieval
Cross-modal retrieval aims to find relevant items across different data types (e.g., images and text, audio and video) by learning shared representations that capture semantic similarities. Current research focuses on improving retrieval accuracy in the face of noisy data, mismatched pairs, and the "modality gap" using techniques like contrastive learning, masked autoencoders, and optimal transport. These advancements are crucial for applications ranging from medical image analysis and robotics to multimedia search and music recommendation, enabling more effective information access and integration across diverse data sources.
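The contrastive-learning approach mentioned above can be illustrated with a minimal sketch of a symmetric InfoNCE loss (the CLIP-style objective): matched image–text pairs sit on the diagonal of a similarity matrix, and the loss pulls those pairs together while pushing mismatched pairs apart. The function name, temperature value, and NumPy implementation below are illustrative assumptions, not taken from any specific paper.

```python
import numpy as np

def info_nce_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched image/text pairs.

    image_emb, text_emb: (N, D) arrays where row i of each is a matched pair.
    Names and the temperature default are illustrative assumptions.
    """
    # L2-normalize so dot products are cosine similarities
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # (N, N) similarity matrix; matched pairs lie on the diagonal
    logits = img @ txt.T / temperature
    labels = np.arange(len(img))

    def cross_entropy(l):
        # numerically stable log-softmax per row
        l = l - l.max(axis=1, keepdims=True)
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        # negative log-likelihood of the diagonal (matched) entries
        return -log_prob[labels, labels].mean()

    # average the image-to-text and text-to-image directions
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

In practice the embeddings come from separate image and text encoders trained jointly; aligned pairs should yield a lower loss than misaligned ones, which is what drives the two modalities into a shared representation space.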