Cross-Modality Fusion
Cross-modality fusion aims to integrate information from different data sources (e.g., images, text, audio) to improve the performance of machine learning models beyond what single modalities can achieve. Current research focuses on developing effective fusion strategies, often employing transformer-based architectures, attention mechanisms, and contrastive learning to align and combine features from diverse modalities; a minimal sketch of attention-based fusion appears below. The field is significant because improved cross-modality fusion yields more robust and accurate systems across applications such as object detection, image retrieval, emotion recognition, and healthcare diagnostics.
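The following is a minimal sketch of the attention-based fusion pattern mentioned above, written in PyTorch. All names (`CrossAttentionFusion`, `dim_a`, `dim_b`) and dimensions are illustrative assumptions, not taken from any particular paper: one modality's features act as queries over the other's, and a residual connection preserves the querying modality's own signal in the fused output.

```python
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Hypothetical cross-attention fusion module (a sketch, not any
    specific paper's method). Modality A (e.g., image patches) attends
    to modality B (e.g., text tokens) in a shared embedding space."""

    def __init__(self, dim_a: int, dim_b: int, d_model: int = 256, num_heads: int = 4):
        super().__init__()
        # Project each modality into a shared d_model-dimensional space.
        self.proj_a = nn.Linear(dim_a, d_model)
        self.proj_b = nn.Linear(dim_b, d_model)
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, feats_a: torch.Tensor, feats_b: torch.Tensor) -> torch.Tensor:
        # feats_a: (batch, seq_a, dim_a); feats_b: (batch, seq_b, dim_b)
        q = self.proj_a(feats_a)
        kv = self.proj_b(feats_b)
        # Modality A queries modality B; the output carries B's context
        # aligned to A's sequence positions.
        attended, _ = self.cross_attn(query=q, key=kv, value=kv)
        # Residual connection keeps A's original information in the result.
        return self.norm(q + attended)


if __name__ == "__main__":
    fusion = CrossAttentionFusion(dim_a=512, dim_b=768)
    img = torch.randn(2, 49, 512)   # e.g., 7x7 grid of image patch features
    txt = torch.randn(2, 20, 768)   # e.g., 20 token embeddings
    fused = fusion(img, txt)
    print(fused.shape)              # torch.Size([2, 49, 256])
```

The fused output can feed a downstream head (detection, retrieval, classification); in practice, fusion modules like this are often stacked or made bidirectional, with a contrastive objective used beforehand to align the two embedding spaces.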