Speech Text Alignment
Speech text alignment maps spoken audio to its written transcription in time, determining when each word or phoneme of the text is spoken. Current research focuses on robust models, often built on variational autoencoders (VAEs), transformers, or diffusion models, that align accurately even when the data is noisy or imperfect, using techniques such as knowledge distillation and self-supervised learning. Better alignment improves the accuracy and efficiency of speech processing applications, including automatic speech recognition, text-to-speech synthesis, and multilingual voice processing.
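To make the task concrete, here is a minimal sketch of monotonic alignment by dynamic programming. It is illustrative only and is not the method of any particular paper listed here. It assumes some acoustic model has already produced a per-frame log-probability for each transcript token; the function name `monotonic_align` and the toy scores are hypothetical.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def monotonic_align(
    log_probs: List[List[float]], tokens: List[str]
) -> List[Tuple[str, int, int]]:
    """Assign every audio frame to one token, keeping tokens in order.

    log_probs[t][j] is an assumed model score: the log-probability that
    frame t belongs to tokens[j]. Returns (token, first_frame, last_frame).
    """
    num_frames, num_tokens = len(log_probs), len(tokens)
    if num_tokens == 0 or num_frames < num_tokens:
        raise ValueError("need at least one frame per token")

    # score[t][j]: best total log-probability of a path ending at frame t, token j.
    score = [[NEG_INF] * num_tokens for _ in range(num_frames)]
    # back[t][j]: token index used at frame t - 1 on that best path.
    back = [[0] * num_tokens for _ in range(num_frames)]
    score[0][0] = log_probs[0][0]

    for t in range(1, num_frames):
        # Token j cannot be reached before frame j, so j <= t.
        for j in range(min(t + 1, num_tokens)):
            stay = score[t - 1][j]  # remain on the same token
            move = score[t - 1][j - 1] if j > 0 else NEG_INF  # advance one token
            if move > stay:
                score[t][j] = move + log_probs[t][j]
                back[t][j] = j - 1
            else:
                score[t][j] = stay + log_probs[t][j]
                back[t][j] = j

    # Backtrack from the last frame of the last token.
    path = [0] * num_frames
    j = num_tokens - 1
    for t in range(num_frames - 1, -1, -1):
        path[t] = j
        j = back[t][j]

    # Collapse the frame-level path into one span per token.
    spans = []
    start = 0
    for t in range(1, num_frames + 1):
        if t == num_frames or path[t] != path[t - 1]:
            spans.append((tokens[path[t - 1]], start, t - 1))
            start = t
    return spans


# Toy example: 4 frames, 2 tokens, made-up scores.
toy_scores = [[-0.1, -3.0], [-0.2, -2.5], [-2.0, -0.3], [-3.0, -0.1]]
alignment = monotonic_align(toy_scores, ["h", "i"])
print(alignment)
```

Multiplying the frame indices by the model's frame duration turns these spans into timestamps. Practical systems differ mainly in how the per-frame scores are produced and in how they handle silence, noise, and transcript errors.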
Papers
October 24, 2024
September 30, 2024
July 3, 2024
June 27, 2024
June 25, 2024
June 17, 2024
June 16, 2024
June 14, 2024
May 29, 2024
September 26, 2023
May 19, 2023
February 28, 2023
August 22, 2022
March 31, 2022