Visual Speech
Visual speech research focuses on understanding and exploiting the visual cues of spoken language, chiefly lip movements, to improve speech recognition and translation, particularly in noisy environments or for people with hearing impairments. Current work relies on deep learning models such as transformers and autoencoders, often combined with self-supervised learning and multimodal fusion techniques that integrate the audio and visual streams. The field matters for human-computer interaction and accessibility, and it underpins applications such as speech-to-speech translation and video dubbing.
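To make the fusion idea concrete, below is a minimal sketch, not any specific paper's architecture, of audio-visual fusion for speech recognition in PyTorch: per-modality encoders project audio frames and lip-video embeddings into a shared space, and a transformer encoder attends jointly over the concatenated sequence. All dimensions, the modality-embedding trick, and the CTC-style output head are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AudioVisualFusion(nn.Module):
    """Sketch of early audio-visual fusion via a shared transformer encoder."""

    def __init__(self, audio_dim=80, visual_dim=512, d_model=256,
                 nhead=4, num_layers=2, vocab_size=1000):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, d_model)    # e.g. log-mel frames
        self.visual_proj = nn.Linear(visual_dim, d_model)  # e.g. lip-ROI embeddings
        # Learned modality embeddings tell the transformer which stream
        # each timestep came from (0 = audio, 1 = visual).
        self.modality_emb = nn.Embedding(2, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True)
        self.fusion = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.classifier = nn.Linear(d_model, vocab_size)   # e.g. CTC logits

    def forward(self, audio, visual):
        # audio: (batch, T_a, audio_dim); visual: (batch, T_v, visual_dim)
        a = self.audio_proj(audio) + self.modality_emb(
            torch.zeros(audio.size(1), dtype=torch.long, device=audio.device))
        v = self.visual_proj(visual) + self.modality_emb(
            torch.ones(visual.size(1), dtype=torch.long, device=visual.device))
        # Joint self-attention over both modalities performs the fusion.
        fused = self.fusion(torch.cat([a, v], dim=1))
        return self.classifier(fused)

model = AudioVisualFusion()
logits = model(torch.randn(2, 100, 80), torch.randn(2, 25, 512))
print(logits.shape)  # torch.Size([2, 125, 1000])
```

Concatenating the two streams before self-attention is one common design choice; cross-attention between separate audio and visual encoders is an equally plausible alternative, and the robustness to noise comes from the model's ability to lean on the visual stream when the audio is degraded.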