Multilingual Automatic Speech Recognition

Multilingual Automatic Speech Recognition (MASR) aims to build systems that accurately transcribe speech in multiple languages, overcoming the limitations of monolingual models. Current research focuses on improving accuracy, particularly for low-resource languages, through techniques such as weighted cross-entropy loss functions, N-best re-ranking, and efficient adapter modules within architectures such as Conformer and Whisper. These advances help bridge language barriers in applications ranging from healthcare to global communication, and they contribute to both the theoretical understanding and the practical deployment of speech technology.
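
To make one of the techniques named above concrete, here is a minimal sketch of a language-weighted cross-entropy loss in plain Python. It assumes one plausible reading of "weighted cross-entropy": each training token's negative log-likelihood is scaled by a weight tied to its language, so that low-resource languages contribute more to the loss. The function name, the language codes, and the weight values are hypothetical and chosen only for illustration; they are not taken from any specific paper or library.

```python
import math

def weighted_cross_entropy(log_probs, targets, langs, lang_weights):
    """Weighted average of per-token negative log-likelihoods.

    log_probs:    list of per-token log-probability vectors (one per token)
    targets:      list of correct class indices, one per token
    langs:        list of language codes, one per token
    lang_weights: dict mapping language code -> loss weight (assumed values)
    """
    total, norm = 0.0, 0.0
    for lp, target, lang in zip(log_probs, targets, langs):
        w = lang_weights.get(lang, 1.0)  # unlisted languages get weight 1.0
        total += -w * lp[target]         # weighted negative log-likelihood
        norm += w                        # normalise by the sum of weights
    return total / norm

# Two tokens: one from a high-resource language, one from a low-resource one.
log_probs = [
    [math.log(0.5), math.log(0.5)],
    [math.log(0.25), math.log(0.75)],
]
targets = [0, 1]
langs = ["en", "gl"]                      # hypothetical language codes
weights = {"en": 1.0, "gl": 3.0}          # hypothetical: upweight low-resource

loss = weighted_cross_entropy(log_probs, targets, langs, weights)
print(loss)
```

Normalising by the sum of weights keeps the loss on the same scale as an unweighted average, so changing the weights shifts emphasis between languages without changing the effective learning rate. Other normalisations are possible and may be what a given paper uses.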

Papers