Explainable Multimodal Learning

Explainable multimodal learning aims to build AI systems that integrate and interpret information from diverse data sources (e.g., text, images, sensor data) while providing transparent insight into their decision-making. Current research emphasizes components such as attention mechanisms and concept-based representations to improve the fusion of modalities, paired with explanation techniques such as Grad-CAM and concept mapping. This work is crucial for building trustworthy AI in high-stakes applications like healthcare and crisis response, where understanding the reasoning behind a prediction is essential for both user acceptance and responsible deployment.
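
To make the two ideas above concrete, here is a minimal PyTorch sketch of a toy multimodal classifier in which a text query cross-attends over image feature-map tokens (attention-based fusion), followed by Grad-CAM applied to the image branch. All names, dimensions, and the model itself (`ToyFusionNet`) are illustrative assumptions, not drawn from any specific paper listed below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyFusionNet(nn.Module):
    """Toy classifier: a text query cross-attends over image feature-map tokens.
    (Hypothetical model for illustration only.)"""
    def __init__(self, text_dim=32, channels=32, n_classes=3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1), nn.ReLU(),
        )
        self.query = nn.Linear(text_dim, channels)
        self.head = nn.Linear(channels, n_classes)

    def forward(self, image, text):
        fmap = self.conv(image)                   # (B, C, H, W): Grad-CAM target
        tokens = fmap.flatten(2).transpose(1, 2)  # (B, H*W, C) image tokens
        q = self.query(text).unsqueeze(1)         # (B, 1, C) text query
        attn = torch.softmax(q @ tokens.transpose(1, 2)
                             / tokens.size(-1) ** 0.5, dim=-1)  # (B, 1, H*W)
        fused = (attn @ tokens).squeeze(1)        # (B, C) attended image vector
        return self.head(fused), fmap

model = ToyFusionNet()
image, text = torch.randn(1, 3, 64, 64), torch.randn(1, 32)

logits, fmap = model(image, text)
fmap.retain_grad()                     # keep gradients on the non-leaf feature map
logits[0, logits.argmax()].backward()  # backprop the top-class score

# Grad-CAM: channel weights are the spatially averaged gradients; the heatmap is
# the ReLU of the gradient-weighted feature map, normalized to [0, 1].
weights = fmap.grad.mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * fmap).sum(dim=1))
cam = cam / (cam.max() + 1e-8)
print(cam.shape)  # torch.Size([1, 64, 64]): regions driving the prediction
```

The resulting heatmap highlights which image regions most influenced the predicted class; the attention weights (`attn`) offer a complementary, fusion-level view of which image tokens the text query attended to.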

Papers