Cross-Modal Retrieval
Cross-modal retrieval aims to find relevant items across different data types (e.g., images and text, audio and video) by learning shared representations that capture semantic similarities. Current research focuses on improving retrieval accuracy in the face of noisy data, mismatched pairs, and the "modality gap" using techniques like contrastive learning, masked autoencoders, and optimal transport. These advancements are crucial for applications ranging from medical image analysis and robotics to multimedia search and music recommendation, enabling more effective information access and integration across diverse data sources.
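To make the contrastive-learning idea concrete, below is a minimal NumPy sketch of a symmetric InfoNCE-style objective over a batch of matched image/text embeddings (the CLIP-style formulation). All names, dimensions, and the temperature value are illustrative assumptions, not taken from any specific paper above; a real system would compute the embeddings with trained encoders.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Project embeddings onto the unit sphere so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of matched image/text pairs.

    Row i of img_emb and row i of txt_emb form a positive pair; every other
    row in the batch serves as an in-batch negative.
    """
    img = l2_normalize(img_emb)
    txt = l2_normalize(txt_emb)
    logits = img @ txt.T / temperature        # (B, B) similarity matrix
    labels = np.arange(len(logits))           # positives lie on the diagonal

    def xent(l):
        # Cross-entropy of the softmax over each row against the diagonal label.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the two retrieval directions: image->text and text->image.
    return 0.5 * (xent(logits) + xent(logits.T))

# Toy usage: 4 pairs in an 8-dim shared space (sizes are arbitrary).
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
loss_matched = contrastive_loss(emb, emb)                        # aligned pairs
loss_shuffled = contrastive_loss(emb, np.roll(emb, 1, axis=0))   # mismatched pairs
print(loss_matched, loss_shuffled)
```

Correctly matched pairs yield a much lower loss than shuffled (mismatched) pairs, which is the signal that pulls paired items together in the shared space; the mismatched-pair case is also why robustness to noisy correspondence, mentioned above, matters in practice.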
Papers
Recent papers on this topic date from October 20, 2023 through May 7, 2024.