Cross-Modal Retrieval
Cross-modal retrieval aims to find relevant items across different data types (e.g., images and text, audio and video) by learning shared representations that capture semantic similarities. Current research focuses on improving retrieval accuracy in the face of noisy data, mismatched pairs, and the "modality gap" using techniques like contrastive learning, masked autoencoders, and optimal transport. These advancements are crucial for applications ranging from medical image analysis and robotics to multimedia search and music recommendation, enabling more effective information access and integration across diverse data sources.
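To make the shared-representation idea concrete, the sketch below shows a CLIP-style symmetric InfoNCE contrastive loss that pulls matched image and text embeddings together in a common space, plus a simple cosine-similarity retrieval step. It is a minimal illustration under assumed inputs (pre-computed embeddings from hypothetical image and text encoders), not the method of any specific paper listed here.

```python
# Minimal sketch of contrastive cross-modal alignment and retrieval.
# Assumes two hypothetical encoders (not shown) that map images and texts
# into a shared d-dimensional embedding space; the loss is a symmetric
# InfoNCE objective over matched (image, text) pairs, as in CLIP-style
# training. Illustrative only.

import torch
import torch.nn.functional as F


def contrastive_loss(img_emb: torch.Tensor,
                     txt_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss for a batch of matched image/text pairs."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    # Pairwise cosine similarities, scaled by temperature.
    logits = img_emb @ txt_emb.t() / temperature
    # The i-th image matches the i-th text in the batch.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)      # image -> text
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)


def retrieve(query_emb: torch.Tensor,
             gallery_emb: torch.Tensor,
             k: int = 5) -> torch.Tensor:
    """Return indices of the top-k gallery items for each query."""
    query_emb = F.normalize(query_emb, dim=-1)
    gallery_emb = F.normalize(gallery_emb, dim=-1)
    sims = query_emb @ gallery_emb.t()
    return sims.topk(k, dim=-1).indices


if __name__ == "__main__":
    # Stand-in random embeddings; in practice these come from trained encoders.
    batch, dim = 8, 128
    img_emb = torch.randn(batch, dim)
    txt_emb = torch.randn(batch, dim)
    print("loss:", contrastive_loss(img_emb, txt_emb).item())
    print("top-3 texts per image:\n", retrieve(img_emb, txt_emb, k=3))
```

At inference time only the retrieval step is needed: a query from one modality is embedded and ranked against a gallery of embeddings from the other modality, which is what makes the shared space useful across images, text, audio, or video.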