Monolingual Automatic Speech Recognition
Monolingual automatic speech recognition (ASR) transcribes speech in a single language, aiming for higher accuracy and efficiency than multilingual systems. Current research refines transformer architectures and models trained with connectionist temporal classification (CTC), often augmenting them with k-nearest-neighbor (kNN) retrieval and gated datastores to improve performance, particularly in challenging scenarios such as code-switching. These advances make speech technology more accessible and usable, with applications ranging from cultural heritage preservation to efficient transcription services.
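As a rough illustration of how a CTC-trained model's output becomes text, the sketch below applies greedy CTC decoding: take the best-scoring label per frame, merge consecutive repeats, then drop the blank symbol. The function name `ctc_greedy_decode`, the toy vocabulary, and the choice of index 0 for the blank are assumptions made for this example and do not come from any particular paper or library.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (illustrative choice)


def ctc_greedy_decode(frame_scores, vocab, blank=BLANK):
    """Turn per-frame label scores into a transcript with greedy CTC decoding."""
    # Pick the highest-scoring label index for each frame.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    # Merge runs of the same label into one, keeping the order of the runs.
    collapsed = [label for label, _ in groupby(best)]
    # Remove blanks; a blank between two identical labels keeps them distinct.
    return "".join(vocab[label] for label in collapsed if label != blank)


if __name__ == "__main__":
    vocab = ["_", "a", "b"]  # hypothetical vocabulary, "_" stands for blank
    frames = [
        [0.1, 0.8, 0.1],    # a
        [0.1, 0.8, 0.1],    # a (repeat, merged with the previous frame)
        [0.9, 0.05, 0.05],  # blank
        [0.1, 0.8, 0.1],    # a (new symbol, because a blank intervened)
        [0.1, 0.1, 0.8],    # b
    ]
    print(ctc_greedy_decode(frames, vocab))
```

Greedy decoding is the simplest option; systems of the kind described above may instead use beam search or combine the model's output with retrieved kNN distributions, which this sketch does not attempt to show.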