End-to-End Speech-to-Text Translation
End-to-end speech-to-text translation aims to convert speech in one language directly into written text in another, without the cascade of a separate speech recognition step followed by machine translation. Current research focuses on improving model architectures, such as transformer-based networks and models trained with connectionist temporal classification (CTC), and often employs multi-task learning, consistency regularization, and data augmentation to bridge the modality gap between speech and text and to compensate for the scarcity of paired speech-translation data. These advances could improve cross-lingual communication in applications such as real-time interpretation, automated subtitling, and accessibility tools.
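As a rough illustration of what consistency regularization can look like in this setting, the sketch below combines a translation cross-entropy term with a symmetric KL penalty between two predictions for the same target token. It assumes the two sets of logits come from two stochastic forward passes over the same utterance (for example with different dropout masks or augmentations). The function names and the `alpha` weight are hypothetical and chosen only for illustration; this is one common form of the idea, not the method of any particular paper listed on this page.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_total for x in logits]


def kl_divergence(log_p, log_q):
    """KL(p || q) computed from log-probabilities."""
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def consistency_regularized_loss(logits_a, logits_b, target_index, alpha=1.0):
    """Average cross-entropy of two passes plus a symmetric KL penalty.

    logits_a, logits_b: scores over the target vocabulary from two
        stochastic forward passes (assumed, see lead-in).
    target_index: index of the reference translation token.
    alpha: hypothetical weight of the consistency term.
    """
    log_p = log_softmax(logits_a)
    log_q = log_softmax(logits_b)
    # Translation loss: negative log-likelihood of the reference token.
    ce = -0.5 * (log_p[target_index] + log_q[target_index])
    # Consistency term: penalize disagreement between the two predictions.
    sym_kl = 0.5 * (kl_divergence(log_p, log_q) + kl_divergence(log_q, log_p))
    return ce + alpha * sym_kl


if __name__ == "__main__":
    pass_a = [2.0, 0.5, -1.0]
    pass_b = [1.5, 1.0, -0.5]
    print(consistency_regularized_loss(pass_a, pass_b, target_index=0, alpha=1.0))
```

When both passes produce identical logits the KL term vanishes and the loss reduces to plain cross-entropy, so the extra term only penalizes disagreement between the two predictions. In a real system the same computation would be applied per target position over batched tensors.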