ASR Error Correction

Automatic speech recognition (ASR) error correction aims to improve transcription accuracy by post-processing ASR output. Current research focuses on large language models (LLMs) for this task, exploring both tightly coupled systems that integrate the LLM directly into the ASR pipeline and methods that apply an LLM as a separate post-processing step, often incorporating multimodal information such as visual cues. This work matters because it promises lower word error rates, particularly in challenging settings such as low-resource languages and noisy environments, which in turn makes speech-based applications more reliable.
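Word error rate is the metric these correction methods are usually judged by, so a small worked example may help. The sketch below assumes the standard definition (word-level edit distance divided by the number of reference words). The function name `word_error_rate` and the sample sentences are illustrative, not taken from any particular paper or toolkit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a raw ASR transcript and a corrected version.
reference = "turn on the kitchen lights"
raw_asr = "turn on the chicken lights"
corrected = "turn on the kitchen lights"

wer_before = word_error_rate(reference, raw_asr)
wer_after = word_error_rate(reference, corrected)
print(f"WER before correction: {wer_before:.2f}")
print(f"WER after correction:  {wer_after:.2f}")
```

Comparing the metric before and after a correction step, as in the last lines, is one simple way to check whether post-processing actually helped on a given test set.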

Papers