State of the Art Whisper
Whisper, a large-scale multilingual speech recognition model, is an active research target, with work aimed at improving its accuracy, efficiency, and robustness across diverse speech characteristics and applications. Current research emphasizes adapting Whisper to low-resource languages, improving its streaming capabilities, defending it against adversarial attacks, and combining it with other modalities such as vision for audio-visual speech recognition. These advances have implications for healthcare (e.g., aphasia diagnosis), accessibility (e.g., better speech-to-text for people with speech impairments), and security (e.g., defenses against malicious audio manipulation).