Acoustic Word Embeddings
Acoustic word embeddings (AWEs) are fixed-length vector representations of spoken words that aim to capture phonetic and, in some recent work, semantic information for improved speech processing. Current research focuses on strengthening AWE models with self-supervised pre-trained representations (e.g., HuBERT, wav2vec 2.0), multi-view learning that jointly embeds acoustic and textual (written-word) inputs, and deep metric learning objectives such as proxy-based losses. These advances are improving performance in applications including keyword spotting, speech emotion recognition, and low-resource language processing by enabling more accurate and efficient comparison of spoken-word segments.
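To make the core idea concrete, the sketch below shows one common baseline formulation (not the method of any specific paper listed here): a variable-length sequence of acoustic frames is encoded by a small recurrent network, pooled over time into a fixed-length vector, and L2-normalised so that cosine similarity can be used to compare spoken-word segments. All layer sizes, feature dimensions, and names are illustrative assumptions.

```python
# Minimal AWE encoder sketch: variable-length acoustic features -> fixed-length embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AcousticWordEncoder(nn.Module):
    """Map (batch, frames, feat_dim) acoustic features to fixed-length embeddings."""

    def __init__(self, feat_dim: int = 80, hidden_dim: int = 256, embed_dim: int = 128):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden_dim, embed_dim)

    def forward(self, feats: torch.Tensor, lengths: torch.Tensor) -> torch.Tensor:
        packed = nn.utils.rnn.pack_padded_sequence(
            feats, lengths.cpu(), batch_first=True, enforce_sorted=False
        )
        out, _ = self.rnn(packed)
        out, _ = nn.utils.rnn.pad_packed_sequence(out, batch_first=True)
        # Mean-pool over valid (non-padded) frames, then L2-normalise so that
        # cosine similarity can be used directly for word discrimination.
        mask = (torch.arange(out.size(1))[None, :] < lengths[:, None]).unsqueeze(-1).float()
        pooled = (out * mask).sum(dim=1) / lengths[:, None].clamp(min=1)
        return F.normalize(self.proj(pooled), dim=-1)


if __name__ == "__main__":
    encoder = AcousticWordEncoder()
    # Two padded spoken-word segments, e.g. 62 and 45 frames of 80-dim filterbanks.
    feats = torch.randn(2, 62, 80)
    lengths = torch.tensor([62, 45])
    embeddings = encoder(feats, lengths)        # shape: (2, 128)
    similarity = embeddings[0] @ embeddings[1]  # cosine similarity between the two words
    print(embeddings.shape, float(similarity))
```

In practice, the input filterbank features are often replaced with frame-level representations from a self-supervised model such as HuBERT or wav2vec 2.0, and the encoder is trained with a discriminative objective (e.g., a contrastive, triplet, or proxy-based loss) so that different utterances of the same word land close together in the embedding space.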