Self-Supervised Speech
Self-supervised speech (SSS) leverages unlabeled audio data to learn robust speech representations, aiming to improve downstream tasks such as speech recognition and translation without relying heavily on expensive labeled datasets. Current research focuses on understanding what information these models (e.g., WavLM, wav2vec 2.0) learn, comparing their representations to those of human brains and other models, and exploring efficient architectures for resource-constrained environments. This line of work holds significant promise for advancing speech processing in low-resource settings and for applications ranging from speech-to-text translation to mental health screening, where learned representations can serve as novel speech-based biomarkers.
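A common way to use these models is to extract frame-level representations from a pretrained checkpoint and feed them to a lightweight downstream head. Below is a minimal sketch assuming the Hugging Face transformers library and the publicly available facebook/wav2vec2-base-960h checkpoint; the best checkpoint and layer to use depend on the downstream task.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

# Assumed checkpoint; WavLM or other SSL models can be swapped in similarly.
model_name = "facebook/wav2vec2-base-960h"
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_name)
model = Wav2Vec2Model.from_pretrained(model_name)
model.eval()

# One second of dummy 16 kHz audio; replace with a real waveform.
waveform = torch.randn(16000)

inputs = feature_extractor(waveform.numpy(), sampling_rate=16000,
                           return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Frame-level representations, shape (batch, frames, hidden_dim);
# intermediate layers are in outputs.hidden_states.
print(outputs.last_hidden_state.shape)  # e.g. torch.Size([1, 49, 768])
```

Probing studies of what these models learn typically compare such per-layer hidden states against phonetic, speaker, or semantic labels, which is why `output_hidden_states=True` is requested above.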