Multi-Modal Data
Multi-modal data analysis integrates information from diverse sources, such as images, text, audio, and sensor data, to obtain more comprehensive and accurate insights than any single modality provides alone. Current research emphasizes robust models, often based on transformer architectures and contrastive learning, that can fuse these disparate data types effectively, handle missing data, and cope with noisy labels and modality mismatches. The field supports applications including medical diagnosis, urban planning, materials science, and traffic prediction by enabling more sophisticated and reliable analyses of complex systems.
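As a rough illustration of fusing modalities while tolerating missing data, the sketch below averages per-modality embedding vectors and skips any modality that is absent. It is a deliberately simple stand-in, not the method of any paper listed here: the function name `fuse_embeddings`, the modality keys, and the assumption that every modality has already been encoded into a vector of the same dimension are all hypothetical. Transformer-based approaches typically learn the fusion weights rather than using a plain mean.

```python
from typing import Dict, List, Optional


def fuse_embeddings(embeddings: Dict[str, Optional[List[float]]]) -> List[float]:
    """Average the embeddings of the modalities that are present.

    A missing modality is passed as None and skipped, so the result is
    the mean over the available modalities only (hypothetical helper).
    """
    present = [vec for vec in embeddings.values() if vec is not None]
    if not present:
        raise ValueError("at least one modality must be present")
    dim = len(present[0])
    if any(len(vec) != dim for vec in present):
        raise ValueError("all embeddings must share one dimension")
    # Element-wise mean across the available modalities.
    return [sum(vec[i] for vec in present) / len(present) for i in range(dim)]


# Illustrative sample: the audio modality is missing for this record.
sample = {
    "image": [1.0, 0.0, 2.0],
    "text": [3.0, 2.0, 0.0],
    "audio": None,
}
fused = fuse_embeddings(sample)
print(fused)
```

Averaging only over the modalities that are present keeps the fused vector on the same scale whether or not a modality is missing, which is one common reason to prefer it over filling gaps with zeros.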
Papers
Identifying every building's function in large-scale urban areas with multi-modality remote-sensing data
Zhuohong Li, Wei He, Jiepan Li, Hongyan Zhang
xMTrans: Temporal Attentive Cross-Modality Fusion Transformer for Long-Term Traffic Prediction
Huy Quang Ung, Hao Niu, Minh-Son Dao, Shinya Wada, Atsunori Minamikawa