Speech Recognition Model

Speech recognition models transcribe spoken language into text, and research on them centers on making systems more efficient and more robust. Current efficiency work includes mixture-of-experts layers, low-rank adaptation, and dynamic layer skipping, often applied to transformer-based architectures or to models trained with connectionist temporal classification (CTC). These advances matter for deploying speech recognition on resource-constrained devices and for improving accuracy across diverse acoustic conditions and languages, with applications ranging from voice assistants to medical transcription. Research also targets robustness against adversarial attacks and better handling of dialect variation and low-resource languages.
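
To make the CTC reference above more concrete, here is a minimal sketch of greedy CTC decoding: take the best label for each audio frame, merge consecutive repeats, then drop the blank symbol. The function name `ctc_greedy_decode`, the toy vocabulary, and the choice of index 0 for the blank are assumptions made for illustration; production systems usually work on log-probabilities from a trained acoustic model and often use beam search instead.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (illustrative choice)


def ctc_greedy_decode(frame_scores, vocab, blank=BLANK):
    """Turn per-frame label scores into a transcript with greedy CTC decoding."""
    # Step 1: pick the highest-scoring label index for every frame.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    # Step 2: merge runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(best)]
    # Step 3: remove blanks; a blank between two equal labels keeps both.
    return "".join(vocab[i] for i in collapsed if i != blank)


# Toy vocabulary: index 0 is the blank, the rest are characters.
vocab = ["_", "c", "a", "t"]

# Hypothetical per-frame scores whose argmax sequence is: c c _ a a _ t
frames = [
    [0.10, 0.70, 0.10, 0.10],
    [0.10, 0.60, 0.20, 0.10],
    [0.80, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.70, 0.10],
    [0.10, 0.10, 0.60, 0.20],
    [0.90, 0.00, 0.05, 0.05],
    [0.10, 0.10, 0.10, 0.70],
]

transcript = ctc_greedy_decode(frames, vocab)
print(transcript)
```

The blank symbol is what lets a CTC model emit doubled characters: a frame sequence such as a, blank, a decodes to two separate letters, while a, a decodes to one.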

Papers