Transformer-Based Automatic Speech Recognition
Transformer-based automatic speech recognition (ASR) uses the transformer neural network architecture to convert spoken language into text, with the aim of improving both accuracy and efficiency. Current research focuses on optimizing model architectures such as Conformers and Transformer Transducers, incorporating contextual information to improve recognition of rare words, and developing efficient decoding strategies that reduce latency and energy consumption for on-device applications. These advances promise more accurate, faster, and more resource-efficient speech recognition systems, with impact on fields ranging from virtual assistants to accessibility technologies.
Papers
December 21, 2021