Speech Input
Speech input processing enables computers to understand and respond to human speech, bridging human communication and machine interaction. Current research focuses on making speech recognition more robust and accurate across diverse accents, noise levels, and speaking styles, often using large language models (LLMs) and deep learning architectures such as transformers and convolutional recurrent networks. The field is central to advancing human-computer interaction, with applications ranging from virtual assistants and accessibility tools to multimodal systems that interpret both speech and visual information.