ASR Model
Automatic speech recognition (ASR) models transcribe spoken language into text. Current research emphasizes robustness across diverse accents, languages, and noisy environments, often building on transformer-based architectures such as Wav2Vec 2.0 and the Conformer, and sometimes incorporating visual information (e.g., lip movements) for improved accuracy. Significant effort goes toward mitigating demographic biases in ASR output, improving efficiency through knowledge distillation and self-supervised learning, and developing methods for low-resource languages. These advances drive progress in accessibility technology, human-computer interaction, and language documentation.
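Many of the architectures mentioned above, including Wav2Vec 2.0, are commonly trained with a CTC objective, and their per-frame predictions are turned into text by collapsing repeated labels and removing blank tokens. A minimal greedy-decoding sketch, using a made-up three-symbol vocabulary and hypothetical per-frame argmax ids rather than a real acoustic model:

```python
from itertools import groupby

def ctc_greedy_decode(frame_ids, blank_id=0, id_to_char=None):
    """Greedy CTC decoding: merge repeated ids, drop blanks, map to text."""
    collapsed = [k for k, _ in groupby(frame_ids)]    # merge repeated labels
    labels = [i for i in collapsed if i != blank_id]  # remove blank tokens
    return "".join(id_to_char[i] for i in labels)

# Hypothetical vocabulary and per-frame argmax output of an acoustic model.
vocab = {0: "<blank>", 1: "h", 2: "i"}
frames = [1, 1, 0, 0, 2, 2, 2, 0]
print(ctc_greedy_decode(frames, blank_id=0, id_to_char=vocab))  # -> hi
```

In practice, production systems replace this greedy step with beam search, often combined with an external language model, but the collapse-and-remove-blanks rule is the same.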