Multimodal Intent
Multimodal intent research focuses on understanding and predicting human actions and intentions by integrating information from multiple modalities, such as vision, language, and physical interaction. Current work emphasizes models, often built on convolutional neural networks (CNNs) and transformers, that fuse these modalities to predict future actions or behaviors, particularly for human-robot interaction and activity understanding. This research matters for human-computer interaction: it enables more natural, intuitive interaction with robots and AI systems and deepens our understanding of human behavior in settings such as assistive robotics and autonomous driving.
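As a concrete illustration of the CNN-plus-transformer fusion architectures described above, the sketch below encodes an image with a small CNN, encodes tokenized language with a transformer, and classifies the concatenated features into intent labels. It is a minimal, hypothetical PyTorch example: the model name, layer sizes, vocabulary size, number of intent classes, and the concatenation-based late fusion are illustrative assumptions, not the design of any specific paper listed here.

```python
# Minimal sketch of multimodal intent classification (illustrative only):
# a CNN encodes the visual input, a transformer encodes the language input,
# and a classifier predicts an intent from the fused features.
import torch
import torch.nn as nn

class MultimodalIntentModel(nn.Module):  # hypothetical name/architecture
    def __init__(self, vocab_size=10000, embed_dim=128, num_intents=8):
        super().__init__()
        # CNN visual encoder: RGB image -> fixed-size feature vector
        self.vision = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Transformer language encoder: token ids -> contextual embeddings
        self.embed = nn.Embedding(vocab_size, embed_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, batch_first=True)
        self.text = nn.TransformerEncoder(layer, num_layers=2)
        # Late fusion by concatenation, then intent classification
        self.classifier = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, num_intents),
        )

    def forward(self, image, tokens):
        v = self.vision(image)                         # (B, embed_dim)
        t = self.text(self.embed(tokens)).mean(dim=1)  # mean-pool over tokens
        return self.classifier(torch.cat([v, t], dim=-1))  # intent logits

model = MultimodalIntentModel()
logits = model(torch.randn(2, 3, 64, 64),            # batch of 2 images
               torch.randint(0, 10000, (2, 16)))      # batch of 2 token seqs
print(logits.shape)  # torch.Size([2, 8])
```

Concatenation-based late fusion is the simplest choice for a sketch like this; many recent systems instead fuse modalities with cross-attention, but the surrounding training and inference interfaces look much the same.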