Gesture Generation

Gesture generation is the task of creating realistic, contextually appropriate movements to accompany speech or text, primarily so that virtual agents and robots can interact with people more naturally. Current research relies heavily on deep learning models, particularly diffusion models and transformers, and often incorporates multimodal data (audio, text, video) to improve the naturalness and semantic coherence of the generated gestures. The field matters for human-robot interaction, virtual character animation, and accessibility technologies, where it enables more natural and expressive communication.

Papers