Whisper Model
Whisper is a large, pre-trained multilingual speech recognition model that performs strongly on automatic speech recognition (ASR) and has also been applied to related tasks such as speaker verification and deepfake detection. Current research focuses on improving Whisper's accuracy and robustness for low-resource languages and diverse speaker characteristics, and on reducing its computational cost, using techniques such as retrieval augmentation, knowledge distillation, and adaptive compression. These advances matter for broadening access to speech technology and for making it more reliable across applications ranging from personalized assistants to combating misinformation.
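Knowledge distillation, one of the efficiency techniques mentioned above, is commonly formulated as training a smaller student model to match the softened output distribution of a larger teacher. The following is a minimal sketch of that generic loss in plain Python, not the method of any particular Whisper paper; the logits, the four-token vocabulary, and the temperature value are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits over a 4-token vocabulary for one decoding step.
teacher = [4.0, 1.0, 0.5, -2.0]
student = [3.0, 1.5, 0.0, -1.0]

print(distillation_loss(teacher, student))  # positive: the student disagrees
print(distillation_loss(teacher, teacher))  # identical distributions give 0.0
```

In practice this term would typically be averaged over every decoding step and combined with the ordinary cross-entropy loss on the reference transcript; a higher temperature exposes more of the teacher's relative preferences among unlikely tokens.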