Multimodal Data Fusion
Multimodal data fusion integrates information from diverse sources (e.g., images, text, sensor data) to improve the accuracy, robustness, and interpretability of machine learning models. Current research emphasizes efficient fusion architectures, such as attention mechanisms and transformers, and the choice of fusion stage (early, intermediate, or late) depending on the data and task. By enabling more comprehensive and reliable analyses than single-modality approaches, the field is shaping applications from healthcare (e.g., improved diagnostics and personalized medicine) to remote sensing (e.g., enhanced object detection and wildfire prediction).
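To make the early/late distinction concrete, the following is a minimal PyTorch sketch contrasting the two strategies on pre-extracted image and text feature vectors. The class names, feature dimensions, and simple logit averaging are illustrative assumptions, not taken from any of the papers listed below; an intermediate strategy would instead merge hidden representations partway through each branch (e.g., via cross-attention).

```python
# Illustrative sketch: early vs. late fusion of two modalities (hypothetical sizes).
import torch
import torch.nn as nn


class EarlyFusion(nn.Module):
    """Concatenate modality features first, then learn a single joint model."""
    def __init__(self, img_dim=512, txt_dim=300, hidden=256, n_classes=2):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(img_dim + txt_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, img_feat, txt_feat):
        fused = torch.cat([img_feat, txt_feat], dim=-1)  # fuse before any modeling
        return self.classifier(fused)


class LateFusion(nn.Module):
    """Model each modality separately, then combine the per-modality predictions."""
    def __init__(self, img_dim=512, txt_dim=300, hidden=256, n_classes=2):
        super().__init__()
        self.img_head = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))
        self.txt_head = nn.Sequential(nn.Linear(txt_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))

    def forward(self, img_feat, txt_feat):
        # Average the modality-specific logits; weighted or learned combinations are also common.
        return 0.5 * (self.img_head(img_feat) + self.txt_head(txt_feat))


if __name__ == "__main__":
    img = torch.randn(4, 512)  # e.g., pooled CNN features for a batch of 4 images
    txt = torch.randn(4, 300)  # e.g., averaged word embeddings for the paired text
    print(EarlyFusion()(img, txt).shape)  # torch.Size([4, 2])
    print(LateFusion()(img, txt).shape)   # torch.Size([4, 2])
```

As a rule of thumb, early fusion lets the model learn cross-modal interactions directly but is more sensitive to missing or misaligned modalities, while late fusion is easier to train and degrades more gracefully when a modality is absent, at the cost of capturing fewer interactions.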
Papers
Unsupervised Multimodal Fusion of In-process Sensor Data for Advanced Manufacturing Process Monitoring
Matthew McKinney, Anthony Garland, Dale Cillessen, Jesse Adamczyk, Dan Bolintineanu, Michael Heiden, Elliott Fowler, Brad L. Boyce
Enhanced Survival Prediction in Head and Neck Cancer Using Convolutional Block Attention and Multimodal Data Fusion
Aiman Farooq, Utkarsh Sharma, Deepak Mishra