Multimodal Signal
Multimodal signal processing integrates information from diverse sources, such as audio, video, text, and physiological data, to produce analyses that are more robust and comprehensive than any single modality allows. Current research emphasizes models, including transformers, diffusion models, and recurrent networks such as LSTMs, that fuse these heterogeneous data types for tasks ranging from emotion recognition and human-robot interaction to medical image synthesis and automated driving. The field matters because combining modalities supports a more accurate and nuanced understanding of complex systems and behaviors, which drives advances in healthcare, robotics, and autonomous systems.
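To make the idea of fusing modalities concrete, here is a minimal sketch of late fusion, where each modality produces its own class probabilities and the results are combined by a weighted average. This is a deliberately simple stand-in, not the transformer, diffusion, or LSTM approaches mentioned above. The function name `late_fusion`, the emotion classes, the per-modality probabilities, and the weights are all hypothetical values chosen for illustration.

```python
from typing import Dict, List


def late_fusion(
    modality_probs: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Fuse per-modality class probabilities with a normalized weighted average."""
    if not modality_probs:
        raise ValueError("need at least one modality")
    n_classes = len(next(iter(modality_probs.values())))
    # Normalize weights over the modalities that are actually present.
    total_weight = sum(weights[name] for name in modality_probs)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = [0.0] * n_classes
    for name, probs in modality_probs.items():
        if len(probs) != n_classes:
            raise ValueError(f"modality {name!r} has a different number of classes")
        w = weights[name] / total_weight
        for i, p in enumerate(probs):
            fused[i] += w * p
    return fused


# Hypothetical emotion classes and made-up per-modality model outputs.
CLASSES = ["neutral", "happy", "angry"]
probs = {
    "audio": [0.2, 0.5, 0.3],
    "video": [0.1, 0.7, 0.2],
    "text": [0.6, 0.3, 0.1],
}
# Assumed reliability weights; in practice these would be tuned or learned.
weights = {"audio": 1.0, "video": 2.0, "text": 1.0}

fused = late_fusion(probs, weights)
predicted = CLASSES[max(range(len(fused)), key=fused.__getitem__)]
print(predicted, [round(p, 3) for p in fused])
```

In this toy input the text modality alone would favor "neutral", but the fused distribution favors "happy" because audio and video agree and video carries the largest weight. Learned fusion models replace the fixed weights with parameters trained jointly across modalities.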