Speech Text
Speech-text research develops models that bridge spoken and written language, with the goal of better understanding and generation in both modalities. Current work concentrates on jointly pre-training speech and text with encoder-decoder architectures and multi-task learning, often adding self-supervised tasks that exploit unlabeled data and improve cross-modal alignment. These advances benefit automatic speech recognition, speech translation, and text-to-speech synthesis, yielding more accurate and more natural-sounding systems, including in low-resource scenarios.
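As a rough illustration of the multi-task idea, joint pre-training is commonly framed as minimizing a weighted sum of per-task losses computed over a shared model. The sketch below shows only that combination step. The function name, the task names, and the weights are all hypothetical and are not taken from any particular paper or library.

```python
from typing import Dict


def joint_pretraining_loss(task_losses: Dict[str, float],
                           task_weights: Dict[str, float]) -> float:
    """Combine per-task losses into one scalar objective (illustrative only).

    task_losses:  loss value per task for the current batch.
    task_weights: hypothetical mixing weights; tasks without an entry
                  default to a weight of 1.0.
    """
    # Reject weights that refer to tasks with no computed loss.
    unknown = set(task_weights) - set(task_losses)
    if unknown:
        raise KeyError(f"weights given for unknown tasks: {sorted(unknown)}")
    # Weighted sum over all tasks that produced a loss.
    return sum(task_weights.get(name, 1.0) * loss
               for name, loss in task_losses.items())


# Assumed example: two self-supervised tasks plus one supervised task.
losses = {
    "speech_masked_prediction": 2.0,  # self-supervised, unlabeled speech
    "text_denoising": 1.0,            # self-supervised, unlabeled text
    "asr": 0.5,                       # supervised, paired speech-text
}
weights = {"speech_masked_prediction": 0.5, "text_denoising": 0.5, "asr": 1.0}
total = joint_pretraining_loss(losses, weights)
```

In practice the weights are tuning choices, and real systems compute each loss from model outputs rather than taking fixed numbers as shown here.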