Speech to Text
Speech-to-text (STT) research aims to convert spoken language into written text accurately and efficiently, covering tasks such as automatic speech recognition and speech translation. Current work focuses on improving model robustness and accuracy, particularly for low-resource languages and challenging audio conditions. It often draws on large language models (LLMs) and transformer-based architectures such as Whisper and Conformer, together with techniques such as data augmentation and transfer learning. These advances matter for accessibility: they improve human-computer interaction and support more inclusive and versatile applications across many fields.
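As one illustration of the data augmentation mentioned above, the sketch below mixes noise into a clean signal at a chosen signal-to-noise ratio (SNR), a common way to simulate challenging audio conditions during training. It is a minimal sketch, not a method taken from any particular paper: the function names `mix_at_snr` and `measured_snr_db` are hypothetical, and the "speech" is a synthetic 440 Hz tone standing in for real audio.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the requested SNR in dB."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("signals must have non-zero power")
    # Target noise power is speech_power / 10^(snr_db / 10).
    scale = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixed):
    """Compute the SNR of a mix from the clean signal and the residual."""
    residual = [m - s for s, m in zip(speech, mixed)]
    speech_power = sum(s * s for s in speech)
    residual_power = sum(r * r for r in residual)
    return 10 * math.log10(speech_power / residual_power)


rng = random.Random(0)  # fixed seed so the example is reproducible
# Synthetic stand-ins (assumptions): one second of a 440 Hz tone sampled
# at 16 kHz for "speech", and Gaussian noise for the background.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

In a training pipeline, the SNR would typically be sampled at random per utterance so that the model sees a range of noise levels; the fixed 10 dB here is only for demonstration.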