Automatic Speech Recognition System
Automatic Speech Recognition (ASR) systems aim to convert spoken language into text accurately, a task that underpins a wide range of speech-based applications. Current research focuses heavily on improving accuracy and robustness through end-to-end models (e.g., Conformer transducers), integrating large language models for error correction and rescoring, and addressing biases in ASR performance across dialects and languages. These advances are vital for making speech-based technologies more accessible and usable in fields ranging from healthcare and assistive technologies to virtual assistants and language documentation.
Papers
Improving Membership Inference in ASR Model Auditing with Perturbed Loss Features
Francisco Teixeira, Karla Pizzi, Raphael Olivier, Alberto Abad, Bhiksha Raj, Isabel Trancoso
Deep Learning Models in Speech Recognition: Measuring GPU Energy Consumption, Impact of Noise and Model Quantization for Edge Deployment
Aditya Chakravarty