End-to-End Speech Recognition
End-to-end speech recognition aims to transcribe speech directly into text with a single model, without the intermediate steps of traditional hybrid systems, which can improve efficiency and potentially accuracy. Current research focuses on known limitations, such as robustness to noise and handling of words not seen in training. It often employs transformer-based architectures and connectionist temporal classification (CTC), along with techniques like data augmentation and speaker adaptation. These advances matter because they make speech recognition more accurate and more widely applicable across diverse accents, languages, and noisy environments, with impact on fields ranging from voice assistants to healthcare applications.
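As a rough illustration of how CTC output becomes text, the sketch below shows greedy CTC decoding: take the best label per frame, merge consecutive repeats, then drop the blank symbol. The function name `ctc_greedy_decode`, the toy vocabulary, and the per-frame scores are all hypothetical and chosen for illustration; a real system would take its scores from a trained acoustic model and would often use beam search instead.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol in the vocabulary


def ctc_greedy_decode(frame_scores, vocab, blank=BLANK):
    """Greedy CTC decoding: best label per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label index for each frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # Merge consecutive repeated labels into one.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove blanks and map the remaining indices to symbols.
    return "".join(vocab[label] for label in collapsed if label != blank)


# Toy example (made-up scores): six frames over a four-symbol vocabulary.
vocab = ["_", "c", "a", "t"]
frame_scores = [
    [0.10, 0.70, 0.10, 0.10],  # "c"
    [0.10, 0.60, 0.20, 0.10],  # "c" again (repeat, merged)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # "a"
    [0.10, 0.10, 0.10, 0.70],  # "t"
    [0.10, 0.10, 0.10, 0.70],  # "t" again (repeat, merged)
]
transcript = ctc_greedy_decode(frame_scores, vocab)
print(transcript)  # prints "cat"
```

The blank symbol is what lets CTC emit genuinely doubled letters: two identical labels separated by a blank frame survive the merge step, while two adjacent identical labels collapse into one.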