Speech Input
Speech input processing focuses on enabling computers to understand and respond to human speech, bridging the gap between human communication and machine interaction. Current research emphasizes improving the robustness and accuracy of speech recognition across diverse accents, noise levels, and speaking styles, often using large language models (LLMs) and deep learning architectures such as transformers and convolutional recurrent networks. The field is central to advancing human-computer interaction, with applications ranging from virtual assistants and accessibility tools to multimodal systems that interpret both speech and visual information.