Multimodality
Multimodality in machine learning focuses on integrating information from diverse data sources (e.g., text, images, audio, sensor data) to improve model performance and robustness. Current research emphasizes effective fusion strategies within architectures such as transformers and autoencoders, often combined with contrastive learning and with techniques for handling missing modalities. This approach is proving valuable across numerous applications, from medical diagnosis and e-commerce to assistive robotics and urban planning, by enabling more comprehensive and accurate analyses than unimodal methods.
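To make the ingredients above concrete, the following is a minimal, illustrative sketch of a multimodal pipeline: per-modality projections fused by concatenation, a simple mask for a missing modality, and a symmetric contrastive objective aligning paired embeddings. It is not drawn from any of the papers listed below; the module names, feature dimensions, and encoders are assumptions for illustration.

# Minimal sketch of multimodal fusion with a contrastive objective and a
# simple missing-modality mask. Illustrative only; names and sizes are
# assumptions, not taken from any of the papers listed below.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LateFusionClassifier(nn.Module):
    """Projects each modality to a shared space, then fuses by concatenation."""

    def __init__(self, image_dim=512, text_dim=768, hidden=256, num_classes=10):
        super().__init__()
        self.image_proj = nn.Linear(image_dim, hidden)
        self.text_proj = nn.Linear(text_dim, hidden)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, image_feat, text_feat, text_present):
        img = F.relu(self.image_proj(image_feat))
        txt = F.relu(self.text_proj(text_feat))
        # Zero out the text branch for samples where that modality is missing.
        txt = txt * text_present.unsqueeze(-1)
        return self.head(torch.cat([img, txt], dim=-1))


def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss aligning paired image/text embeddings."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature
    targets = torch.arange(logits.size(0))
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2


if __name__ == "__main__":
    batch = 8
    image_feat = torch.randn(batch, 512)   # e.g. features from a vision encoder
    text_feat = torch.randn(batch, 768)    # e.g. features from a text encoder
    text_present = (torch.rand(batch) > 0.2).float()  # simulate missing text

    model = LateFusionClassifier()
    logits = model(image_feat, text_feat, text_present)
    loss = contrastive_loss(model.image_proj(image_feat),
                            model.text_proj(text_feat))
    print(logits.shape, loss.item())

Concatenation-based late fusion and zero-masking are only the simplest options; the papers below explore alternatives such as attention-based fusion and attribution regularization across modalities.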
Papers
Advancing Histopathology-Based Breast Cancer Diagnosis: Insights into Multi-Modality and Explainability
Faseela Abdullakutty, Younes Akbari, Somaya Al-Maadeed, Ahmed Bouridane, Rifat Hamoudi
Contextual fusion enhances robustness to image blurring
Shruti Joshi, Aiswarya Akumalla, Seth Haney, Maxim Bazhenov
Attribution Regularization for Multimodal Paradigms
Sahiti Yerramilli, Jayant Sravan Tamarapalli, Jonathan Francis, Eric Nyberg
LastResort at SemEval-2024 Task 3: Exploring Multimodal Emotion Cause Pair Extraction as Sequence Labelling Task
Suyash Vardhan Mathur, Akshett Rai Jindal, Hardik Mittal, Manish Shrivastava
Tell and show: Combining multiple modalities to communicate manipulation tasks to a robot
Petr Vanc, Radoslav Skoviera, Karla Stepanova