Non-Autoregressive Automatic Speech Recognition

Non-autoregressive (NAR) automatic speech recognition predicts all output tokens in parallel rather than one at a time, aiming for faster inference while keeping accuracy comparable to traditional autoregressive methods. Current research focuses on improving the accuracy of NAR models, often using transformer-based architectures, connectionist temporal classification (CTC), and techniques such as Mask-CTC and language model integration. These advances could enable faster and more efficient speech recognition systems, with applications ranging from real-time transcription to voice-controlled devices.
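To make the parallel-decoding idea concrete, here is a minimal sketch of greedy CTC decoding, the simplest non-autoregressive decoding rule. It is an illustration only, not the method of any particular paper listed here. The function name `ctc_greedy_decode`, the choice of index 0 for the blank symbol, and the toy scores are all assumptions made for the example.

```python
from typing import List, Optional, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (a convention, not a requirement)


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank: int = BLANK
) -> List[int]:
    """Greedy CTC decoding over per-frame label scores.

    frame_scores[t][k] is the score of label k at frame t.
    """
    # Step 1: choose the best label for every frame independently.
    # No frame depends on another frame's output, so this step can run
    # in parallel, which is what makes the decoding non-autoregressive.
    best = [max(range(len(f)), key=lambda k: f[k]) for f in frame_scores]

    # Step 2: collapse consecutive repeats, then drop blanks.
    decoded: List[int] = []
    prev: Optional[int] = None
    for label in best:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded


# Toy example: 5 frames, 3 labels (0 = blank). The per-frame argmax is
# [1, 1, 0, 1, 2]; the blank between the two 1s keeps them distinct.
toy_scores = [
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.2, 0.1, 0.7],
]
print(ctc_greedy_decode(toy_scores))  # [1, 1, 2]
```

Because each frame is labeled without looking at previously emitted tokens, the output ignores dependencies between tokens. That gap is what refinements such as Mask-CTC and language model integration try to close.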

Papers