Contextual Automatic Speech Recognition
Contextual automatic speech recognition (ASR) aims to improve speech-to-text accuracy by incorporating contextual information, such as a user's vocabulary or surrounding text, so that rare words and named entities are recognized more reliably. Current research focuses on neural network architectures such as transformer transducers and on techniques like contextual biasing, retrieval augmentation, and multi-modal approaches (e.g., using visual information from slides). These advances matter because they address the difficulty traditional ASR systems have with rare words and named entities, leading to more robust and accurate transcriptions in challenging real-world scenarios such as spoken dialog systems and meeting transcription.
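To make the idea of contextual biasing concrete, here is a minimal sketch of one simple variant: re-ranking an n-best list so that hypotheses containing a phrase from a user-supplied bias list receive a score bonus. This is an illustration under stated assumptions, not the method of any particular paper. The function name `rescore_with_bias`, the `bonus` value, and the example hypotheses and scores are all hypothetical; real systems typically apply biasing inside the decoder or the model itself rather than as a post-hoc string match.

```python
from typing import List, Tuple


def rescore_with_bias(
    nbest: List[Tuple[str, float]],
    bias_phrases: List[str],
    bonus: float = 2.0,  # hypothetical log-score bonus per matched phrase
) -> List[Tuple[str, float]]:
    """Re-rank (text, log_score) hypotheses, boosting those that contain bias phrases.

    Matching is a case-insensitive substring test, which is deliberately crude:
    it can also match inside longer words.
    """
    rescored = []
    for text, score in nbest:
        lowered = text.lower()
        # Count how many distinct bias phrases appear in this hypothesis.
        hits = sum(1 for phrase in bias_phrases if phrase.lower() in lowered)
        rescored.append((text, score + bonus * hits))
    # Higher score is better, so sort in descending order.
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Made-up n-best list: the acoustically preferred hypothesis misspells the name.
nbest = [
    ("call jon smith", -4.1),
    ("call john smyth", -4.6),
    ("call johns myth", -5.0),
]

# The bias list stands in for contextual information such as a contact list.
reranked = rescore_with_bias(nbest, ["John Smyth"])
best_text = reranked[0][0]
print(best_text)
```

With the bias list supplied, the hypothesis spelling the contact name correctly moves to the top even though its original score was lower. The size of the bonus controls the trade-off: too small and rare names are still missed, too large and the recognizer starts hallucinating bias phrases that were not spoken.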
Papers
July 16, 2024
July 14, 2024
June 5, 2024
February 2, 2024
January 23, 2024
December 15, 2023
September 18, 2023
September 14, 2023
September 11, 2023
May 30, 2023
May 22, 2023
April 18, 2023
February 26, 2023
July 25, 2022
July 2, 2022