Language Model Rescoring

Language model rescoring improves the accuracy of automatic speech recognition (ASR) by using a large language model (LLM) to re-evaluate the recognizer's initial output. Current research explores a range of LLMs, including instruction-tuned and multi-modal models, typically applied through lattice rescoring or n-best list reranking to refine transcriptions. Reported results show significant word error rate reductions across diverse ASR tasks and datasets, supporting the development of more accurate and robust speech technologies for applications such as virtual assistants and voice search.
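
To make n-best list reranking concrete, here is a minimal Python sketch. It assumes each first-pass hypothesis carries a log-score and that the final score is a simple weighted sum of the ASR and language model log-scores, which is one common formulation rather than the method of any particular paper. The names `Hypothesis`, `rerank_nbest`, `lm_weight`, and the toy word-probability table are all hypothetical; in practice `lm_log_prob` would be backed by an actual LLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str         # candidate transcription from the first-pass recognizer
    asr_score: float  # first-pass log-score (higher is better)


def rerank_nbest(
    nbest: List[Hypothesis],
    lm_log_prob: Callable[[str], float],
    lm_weight: float = 0.5,
) -> List[Hypothesis]:
    """Return the hypotheses sorted by ASR score + lm_weight * LM score, best first."""
    def combined(h: Hypothesis) -> float:
        return h.asr_score + lm_weight * lm_log_prob(h.text)

    return sorted(nbest, key=combined, reverse=True)


# Toy stand-in for an LLM scorer: made-up per-word log-probabilities.
# (Assumption for illustration only; a real system would query a language model.)
TOY_WORD_LOG_PROBS = {
    "recognize": -2.0, "speech": -2.5,
    "wreck": -6.0, "a": -1.0, "nice": -3.0, "beach": -4.0,
}
UNKNOWN_LOG_PROB = -10.0


def toy_lm_log_prob(text: str) -> float:
    """Sum the toy log-probabilities of the words in the text."""
    return sum(
        TOY_WORD_LOG_PROBS.get(word, UNKNOWN_LOG_PROB)
        for word in text.lower().split()
    )


# The first pass slightly prefers the acoustically similar but unlikely phrase.
nbest = [
    Hypothesis("wreck a nice beach", asr_score=-4.0),
    Hypothesis("recognize speech", asr_score=-4.5),
]

best = rerank_nbest(nbest, toy_lm_log_prob)[0]
print(best.text)  # recognize speech
```

With `lm_weight=0.0` the first-pass ranking is left unchanged, so the weight controls how far the language model may override the recognizer. Lattice rescoring follows the same idea but rescores paths in the recognizer's lattice instead of a flat list of candidates.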

Papers