Multi-Modal Representation
Multi-modal representation learning aims to build unified representations from diverse data types (e.g., images, text, audio) to improve downstream tasks such as object recognition, medical diagnosis, and recommendation. Current research focuses on effective fusion techniques, often using transformer architectures, contrastive learning, and graph-based methods to align and integrate information across modalities. These methods target two recurring challenges: modality gaps, where embeddings from different modalities occupy separate regions of the shared space, and imbalanced contributions, where one modality dominates the fused representation. The resulting representations support more robust and accurate analysis of complex data, with applications ranging from healthcare to engineering design.
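To make the contrastive alignment idea concrete, here is a minimal sketch of a symmetric InfoNCE-style loss over paired image and text embeddings. It is an illustration under stated assumptions, not the method of any particular paper listed here: the function names, the temperature value of 0.07, and the toy two-dimensional embeddings are all hypothetical, and real systems would compute this on batches of learned encoder outputs with a tensor library.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes a non-zero vector).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax_at(logits, target):
    # Numerically stable log-softmax value at index `target`.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return logits[target] - log_sum

def contrastive_alignment_loss(image_embs, text_embs, temperature=0.07):
    # Hypothetical helper: image_embs[k] and text_embs[k] are assumed to
    # describe the same item; every other pairing in the batch is a negative.
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # Temperature-scaled cosine similarity between every image and every text.
    sim = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
           for i in imgs]
    # Image-to-text direction: each row should peak on its own column.
    loss_i2t = -sum(log_softmax_at(sim[k], k) for k in range(n)) / n
    # Text-to-image direction: each column should peak on its own row.
    loss_t2i = -sum(
        log_softmax_at([sim[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return 0.5 * (loss_i2t + loss_t2i)

# Toy embeddings: matched pairs point in the same direction.
aligned_loss = contrastive_alignment_loss([[1.0, 0.0], [0.0, 1.0]],
                                          [[1.0, 0.0], [0.0, 1.0]])
# Same images, but the text embeddings are swapped (mismatched pairs).
misaligned_loss = contrastive_alignment_loss([[1.0, 0.0], [0.0, 1.0]],
                                             [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each item's two embeddings are closer to each other than to any other item in the batch, and large otherwise, which is the pressure that pulls the modalities into a shared space.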