Cross-Modal Alignment
Cross-modal alignment integrates information from different data modalities (e.g., text, images, audio) into unified representations and uncovers correlations between them. Current research emphasizes efficient, robust alignment methods, often using parameter-efficient fine-tuning, lightweight encoders (such as OneEncoder), and novel loss functions to cope with noisy data and modality imbalance. Better alignment improves applications such as visual question answering, image retrieval, and speech recognition, because models can interpret multimodal data more accurately and completely.
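To make the idea of a unified representation concrete, here is a minimal sketch of one widely used alignment objective, a symmetric contrastive (InfoNCE-style) loss over paired embeddings. The summary above does not name a specific loss, so this choice is an assumption for illustration only; the function names, the toy two-dimensional embeddings, and the temperature value of 0.07 are likewise hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity_matrix(a_embs, b_embs):
    """Return S where S[i][j] is the cosine similarity of a_embs[i] and b_embs[j]."""
    a = [l2_normalize(v) for v in a_embs]
    b = [l2_normalize(v) for v in b_embs]
    return [[sum(x * y for x, y in zip(u, v)) for v in b] for u in a]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(text_embs, image_embs, temperature=0.07):
    """Average of the text-to-image and image-to-text contrastive losses.

    Row i of text_embs is assumed to be paired with row i of image_embs.
    """
    sims = cosine_similarity_matrix(text_embs, image_embs)
    logits = [[s / temperature for s in row] for row in sims]
    logits_t = [list(col) for col in zip(*logits)]  # image-to-text direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy data (hypothetical): two text embeddings and two candidate image batches.
text = [[1.0, 0.0], [0.0, 1.0]]
image_aligned = [[0.9, 0.1], [0.1, 0.9]]  # pair i points the same way as text i
image_swapped = [[0.1, 0.9], [0.9, 0.1]]  # pairs are mismatched

aligned_loss = symmetric_contrastive_loss(text, image_aligned)
swapped_loss = symmetric_contrastive_loss(text, image_swapped)
print(aligned_loss < swapped_loss)
```

The loss is low when each text embedding is closest to its own paired image embedding and high when the pairs are mismatched, which is the pressure that pulls the two modalities into a shared space. Real systems would compute this on learned encoder outputs over large batches rather than hand-written vectors.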