Modal Similarity
Modal similarity research develops methods for comparing and integrating information across data modalities (e.g., text and images, audio and video). Current work emphasizes improving cross-modal alignment through techniques such as contrastive learning and attention mechanisms, often building on pre-trained models such as CLIP, and explores multi-scale and fine-grained similarity measures that capture nuanced relationships. These methods support applications including image captioning, semantic location prediction, and multimodal retrieval by enabling more accurate and robust fusion of information across data types.
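As a rough illustration of the contrastive alignment idea mentioned above, the following is a minimal sketch, not the method of any particular paper. It assumes each modality has already been encoded into fixed-length, non-zero embedding vectors; the function names, the toy embeddings, and the temperature of 0.07 are illustrative assumptions. It computes a cosine similarity matrix between text and image embeddings, then a symmetric cross-entropy loss that rewards matched pairs (the diagonal) for scoring higher than mismatched ones.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_matrix(text_embs, image_embs):
    """Cosine similarity between every text embedding and every image embedding."""
    texts = [l2_normalize(v) for v in text_embs]
    images = [l2_normalize(v) for v in image_embs]
    return [[sum(a * b for a, b in zip(t, i)) for i in images] for t in texts]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row k is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[k]
    return total / len(logits)


def symmetric_contrastive_loss(text_embs, image_embs, temperature=0.07):
    """Average of the text-to-image and image-to-text contrastive losses.

    The temperature is an assumed, illustrative value; in practice it is
    tuned or learned.
    """
    sim = similarity_matrix(text_embs, image_embs)
    logits = [[s / temperature for s in row] for row in sim]
    logits_transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (
        cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_transposed)
    )


# Toy embeddings (hypothetical): pair k in one list matches pair k in the other.
text_embs = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.1]]
image_embs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]

sim = similarity_matrix(text_embs, image_embs)
aligned_loss = symmetric_contrastive_loss(text_embs, image_embs)
shuffled_loss = symmetric_contrastive_loss(text_embs, image_embs[::-1])
```

With these toy inputs, the matched pairs lie on the diagonal of `sim`, so `aligned_loss` comes out lower than `shuffled_loss`, where the image order is reversed and the pairing is broken. Real systems obtain the embeddings from learned encoders and minimize this kind of loss over large batches.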