Multilingual Automatic Speech Recognition Model

Multilingual automatic speech recognition (ASR) aims to build models that accurately transcribe speech in multiple languages, including the many languages for which little training data is available. Current research focuses on making these models more configurable and robust, using techniques such as weighted cross-entropy to improve performance on low-resource languages, knowledge distillation to improve efficiency, and adaptive masking to compress models. These advances help broaden access to speech technology worldwide and improve the accuracy and efficiency of multilingual human-computer interaction.
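As a rough illustration of the weighted cross-entropy idea mentioned above, the sketch below scales each training example's loss by a per-language weight so that low-resource languages contribute more to the objective. The function names, the language codes, the weight values, and the choice to normalize by the sum of weights are all assumptions made for illustration; the papers in this area may define the weighting differently.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(batch, language_weights):
    """Weighted mean of per-example cross-entropy losses.

    batch: list of (logits, target_index, language_code) tuples.
    language_weights: dict mapping language code to a loss weight;
        languages missing from the dict default to weight 1.0.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, target, lang in batch:
        weight = language_weights.get(lang, 1.0)
        prob = softmax(logits)[target]
        weighted_sum += -weight * math.log(prob)
        weight_total += weight
    # Assumption: normalize by total weight so the loss scale stays comparable.
    return weighted_sum / weight_total


# Hypothetical weights: the low-resource language "gl" is up-weighted.
LANGUAGE_WEIGHTS = {"en": 1.0, "gl": 3.0}

# Toy batch: one confident, correct "en" prediction and one poor "gl" prediction.
batch = [
    ([2.0, 0.5, 0.1], 0, "en"),
    ([0.2, 0.3, 1.5], 1, "gl"),
]

loss = weighted_cross_entropy(batch, LANGUAGE_WEIGHTS)
uniform = weighted_cross_entropy(batch, {})
print(f"weighted: {loss:.4f}  unweighted: {uniform:.4f}")
```

In this toy batch the poorly predicted low-resource example dominates the weighted loss, so the weighted value is larger than the unweighted one, which is the effect the technique relies on to steer training toward under-represented languages.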

Papers