Multisensory AI

Multisensory AI aims to build artificial intelligence systems that process and integrate information from multiple sensory modalities, such as vision, audio, and text, to reach a more robust and comprehensive understanding. Current research emphasizes foundation models, such as multimodal transformers, and efficient architectures inspired by biological neural networks, with the goal of improving energy efficiency and resilience. These advances could enable more sophisticated, context-aware AI systems in fields including healthcare (e.g., mental health diagnosis), multimedia processing, and robotics.

Papers