Automatic Speech Recognition System

Automatic Speech Recognition (ASR) systems convert spoken language into text. Current research focuses on improving accuracy and robustness through end-to-end models (e.g., conformer transducers), large language model integration for error correction and rescoring, and addressing biases in ASR performance across dialects and languages. These advances matter for the accessibility and usability of speech-based technologies in fields ranging from healthcare and assistive technologies to virtual assistants and language documentation.

Papers

December 16, 2021

Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems
Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff
Automatic Speech Recognition Domain Specific Transformer Based Language Model Automatic Speech Recognition System Efficient Domain Adaptation

December 14, 2021

Robustifying automatic speech recognition by extracting slowly varying features
Matías Pizarro, Dorothea Kolossa, Asja Fischer
Adversarial Attack Automatic Speech Recognition Adversarial Example Feature Wise Automatic Speech Recognition System Hybrid Automatic Speech Recognition

December 10, 2021

Sequence-level self-learning with multiple hypotheses
Kenichi Kumatani, Dimitrios Dimitriadis, Yashesh Gaur, Robert Gmyr, Sefik Emre Eskimez, Jinyu Li, Michael Zeng
Automatic Speech Recognition Self Supervised Speech Data Automatic Speech Recognition System Attention Based Sequence Depth Hypothesis

December 2, 2021

A higher order Minkowski loss for improved prediction ability of acoustic model in ASR
Vishwanath Pratap Singh, Shakti P. Rath, Abhishek Pandey
Automatic Speech Recognition Loss Function Automatic Speech Recognition System Acoustic Model Improving Prediction Higher Order Loss
