Audio-Visual Speech
Audio-visual speech research combines the acoustic speech signal with visual cues, such as lip and facial movements, to improve speech processing tasks. Current work emphasizes direct audio-visual-to-audio-visual translation. These models learn unified audio-visual representations through self-supervised learning and transformer-based architectures, aiming for real-time, high-fidelity translation and for speech recognition that stays robust in noisy conditions. The field matters for speech technology more broadly because it supports better speech recognition, translation, and enhancement, with applications ranging from virtual meetings to assistive technologies for people with hearing impairments.