Multi-Modal Learning
Multi-modal learning aims to improve machine learning performance by integrating information from diverse data sources such as images, text, and audio. Current research emphasizes robust methods for aligning and fusing these modalities, often employing techniques such as contrastive learning, latent variable models, and attention mechanisms within architectures including transformers and generative models. The field matters because it enables more accurate and comprehensive analyses across numerous domains, from medical diagnosis (e.g., combining images with genomic data) to action recognition (e.g., combining RGB video with skeletal data), advancing both scientific understanding and practical applications.
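The contrastive alignment mentioned above can be illustrated with a minimal NumPy sketch of a CLIP-style symmetric InfoNCE objective. Everything here (function names, the temperature value, the toy batch) is an illustrative assumption, not taken from any specific paper: embeddings from two modalities are normalized onto a shared space, matching pairs sit on the diagonal of a similarity matrix, and all other pairings in the batch act as negatives.

```python
import numpy as np

def l2_normalize(x):
    """Project each row onto the unit sphere."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def contrastive_alignment_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    Row i of each matrix is assumed to describe the same underlying
    sample; matching pairs are pulled together and every other pairing
    in the batch serves as a negative. (Hypothetical sketch, not a
    specific paper's implementation.)
    """
    img, txt = l2_normalize(image_emb), l2_normalize(text_emb)
    logits = img @ txt.T / temperature           # (B, B) scaled cosine similarities
    idx = np.arange(len(logits))                 # correct pair = diagonal

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)     # numerical stability
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_p[idx, idx].mean()

    # Average the image->text and text->image directions.
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2

# Toy check: perfectly aligned pairs vs. deliberately mismatched ones.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
loss_aligned = contrastive_alignment_loss(emb, emb)
loss_shuffled = contrastive_alignment_loss(emb, np.roll(emb, 1, axis=0))
```

Mismatching the pairs (here by rolling the second matrix) raises the loss, which is the gradient signal that pulls corresponding image and text embeddings together during training.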