Lip Movement
Lip movement research studies and models the dynamics of lip articulation during speech, with the main goals of improving speech recognition, speech synthesis, and human-computer interaction. Current work relies heavily on deep learning models, including GANs, diffusion models, and transformer networks. These are often combined with techniques such as disentangled representation learning and audio-visual synchronization, either to generate realistic lip movements from audio or text input or to make speech recognition more robust in noisy conditions. Applications include assistive technologies for people with hearing impairments, realistic avatars for virtual and augmented reality, and improved human-computer interfaces.
Papers
Cued Speech Generation Leveraging a Pre-trained Audiovisual Text-to-Speech Model
Sanjana Sankar, Martin Lenglet, Gerard Bailly, Denis Beautemps, Thomas Hueber
LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech Recognition
Bowen Hao, Dongliang Zhou, Xiaojie Li, Xingyu Zhang, Liang Xie, Jianlong Wu, Erwei Yin