Automatic Speech Recognition
Automatic Speech Recognition (ASR) aims to accurately transcribe spoken language into text, driving research into robust and efficient models. Current efforts focus on improving accuracy and robustness through techniques like consistency regularization in Connectionist Temporal Classification (CTC), leveraging pre-trained multilingual models for low-resource languages, and integrating Large Language Models (LLMs) for enhanced contextual understanding and improved handling of diverse accents and speech disorders. These advancements have significant implications for accessibility, enabling applications in diverse fields such as healthcare, education, and human-computer interaction.
1014 papers
Papers - Page 19
March 10, 2024
March 4, 2024
What has LeBenchmark Learnt about French Syntax?
Zdravko Dugonjić, Adrien Pupier, Benjamin Lecouteux, Maximin Coavoux
SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR
Zhiyun Fan, Linhao Dong, Jun Zhang, Lu Lu, Zejun Ma
Language and Speech Technology for Central Kurdish Varieties
Sina Ahmadi, Daban Q. Jaff, Md Mahfuz Ibn Alam, Antonios Anastasopoulos
JEP-KD: Joint-Embedding Predictive Architecture Based Knowledge Distillation for Visual Speech Recognition
Chang Sun, Hong Yang, Bo Qin
February 29, 2024
February 28, 2024
February 22, 2024
February 15, 2024
February 12, 2024