Non-Streaming
Non-streaming automatic speech recognition (ASR) models process the entire audio input before generating transcriptions, offering superior accuracy compared to their streaming counterparts, which process audio in real time. Current research focuses on closing the performance gap between streaming and non-streaming ASR, employing techniques such as knowledge distillation to transfer knowledge from non-streaming models to streaming ones, as well as contextual biasing and contrastive learning to improve accuracy. A minimal sketch of the distillation idea is given below. These advances aim to improve the accuracy of real-time speech recognition systems while maintaining low latency, benefiting applications such as voice search, virtual assistants, and on-device speech processing.
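The sketch below is a hedged illustration, not taken from any of the papers listed here: it assumes a non-streaming teacher and a streaming student that both emit frame-level logits over the same vocabulary, and combines a standard CTC loss on the ground-truth transcript with a KL-divergence term that pulls the student's posteriors toward the teacher's. The function name `distillation_loss` and the hyperparameters `temperature` and `alpha` are illustrative choices.

```python
# Minimal sketch: knowledge distillation from a non-streaming ASR teacher
# to a streaming student (assumed shapes: logits are (N, T, V)).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      input_lengths, target_lengths,
                      temperature=2.0, alpha=0.5):
    """Blend a supervised CTC loss with a soft-label KL term from the teacher."""
    # Soft targets from the full-context (non-streaming) teacher.
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    kd_term = F.kl_div(student_log_probs, teacher_probs,
                       reduction="batchmean") * temperature ** 2

    # Standard supervised CTC loss on the streaming student's outputs.
    log_probs = F.log_softmax(student_logits, dim=-1).transpose(0, 1)  # (T, N, V)
    ctc_term = F.ctc_loss(log_probs, targets, input_lengths, target_lengths,
                          blank=0, zero_infinity=True)

    return alpha * kd_term + (1 - alpha) * ctc_term
```

In practice the teacher runs offline over the full utterance while the student sees only limited right context, so the KL term transfers the teacher's full-context knowledge into the low-latency model; the weighting between the two terms is a tunable design choice.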