End-to-End ASR
End-to-end (E2E) automatic speech recognition (ASR) aims to transcribe speech directly to text with a single model, without separate intermediate stages such as phoneme recognition, with the goal of improving efficiency and accuracy. Current research focuses on making E2E ASR more robust to domain-specific vocabulary (e.g., named entities, technical jargon) by incorporating contextual information (e.g., descriptions, dialog acts) and leveraging pre-trained language models. These advances matter because they improve the accuracy and reliability of speech-to-text systems across applications ranging from voice assistants to medical transcription.
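One simple way to picture the use of contextual information for domain-specific vocabulary is to rescore an n-best list so that hypotheses containing known domain terms get a score bonus. The sketch below is illustrative only and is not taken from any particular paper on this topic: the `Hypothesis` class, the `rescore_with_bias` function, the `bonus` value, and the example transcripts and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str     # candidate transcript from the ASR model
    score: float  # model log-probability (higher is better)


def rescore_with_bias(hyps, bias_phrases, bonus=2.0):
    """Add a fixed bonus per matched bias phrase, then sort best-first.

    Assumption: a plain substring match stands in for the learned
    contextual biasing used in real systems.
    """
    rescored = []
    for hyp in hyps:
        lowered = hyp.text.lower()
        matches = sum(1 for phrase in bias_phrases if phrase.lower() in lowered)
        rescored.append(Hypothesis(hyp.text, hyp.score + bonus * matches))
    return sorted(rescored, key=lambda h: h.score, reverse=True)


# Made-up n-best list: the model slightly prefers the misrecognized drug name.
nbest = [
    Hypothesis("the patient takes met form in daily", -4.1),
    Hypothesis("the patient takes metformin daily", -4.8),
]
best = rescore_with_bias(nbest, ["metformin"])[0]
print(best.text)
```

With the bias list containing "metformin", the second hypothesis overtakes the first. Systems described in the literature typically integrate context inside the model or decoder rather than as a post-hoc string match, so this should be read as a toy illustration of the idea.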