Multi-Modal Representation
Multi-modal representation learning aims to build unified representations from diverse data types (e.g., images, text, audio) to improve downstream tasks such as object recognition, medical diagnosis, and recommendation. Current research focuses on fusion techniques, often using transformer architectures, contrastive learning, and graph-based methods to align and integrate information across modalities. Two recurring challenges are the modality gap, where embeddings from different modalities occupy separate regions of the shared space, and imbalanced contributions, where one modality dominates the fused representation. These advances enable more robust and accurate analysis of complex data in applications ranging from healthcare to engineering design.
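As a rough illustration of how contrastive learning can align two modalities, the sketch below computes a symmetric InfoNCE-style loss over a small batch of paired image and text embeddings. It is a minimal pure-Python example, not the method of any particular paper: the function names, the toy embeddings, and the temperature of 0.07 are assumptions chosen for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities.

    Assumes the vector is non-zero.
    """
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of paired embeddings.

    image_emb[i] and text_emb[i] are assumed to describe the same item;
    every other pairing in the batch is treated as a negative.
    """
    img = [l2_normalize(v) for v in image_emb]
    txt = [l2_normalize(v) for v in text_emb]
    # logits[i][j]: cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(i_vec, t_vec)) / temperature for t_vec in txt]
        for i_vec in img
    ]
    # Transpose to score text-to-image retrieval as well
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy batch (hypothetical values): matched pairs point in similar directions.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = symmetric_contrastive_loss(images, [[0.9, 0.1], [0.1, 0.9]])
shuffled = symmetric_contrastive_loss(images, [[0.1, 0.9], [0.9, 0.1]])
print(aligned < shuffled)
```

The loss is low when each image embedding is closest to its own text embedding and high when the pairs are mismatched, which is the signal that pulls the two modalities into a shared space during training.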