Spoken Term
Spoken term detection (STD) is the task of identifying specific words or phrases within audio recordings. Current research focuses on improving STD accuracy and efficiency with deep learning models, particularly transformer-based architectures and recurrent neural networks such as LSTMs. These models are often trained with contrastive learning and multi-task objectives to exploit unlabeled data and improve robustness. The aim is to reduce reliance on large, manually labeled datasets and to improve performance across diverse languages and acoustic conditions, with applications in keyword spotting, information retrieval from audio archives, and human-computer interaction.
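To make the task concrete, here is a minimal sketch of term detection framed as template matching. It is a toy illustration, not one of the deep learning approaches described above. It assumes the audio and the query term have already been converted to per-frame feature vectors, and it swaps in a plain sliding-window cosine similarity where a real system would use a learned model. The names `cosine` and `detect_term`, the threshold value, and the 2-dimensional features are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def detect_term(frames, query, threshold=0.9):
    """Slide the query over the recording and collect matching windows.

    frames: list of feature vectors, one per audio frame (assumed precomputed).
    query: list of feature vectors for the term being searched for.
    Returns (start_frame, score) for every window whose mean frame-wise
    cosine similarity to the query is at least `threshold`.
    """
    hits = []
    n = len(query)
    for start in range(len(frames) - n + 1):
        window = frames[start:start + n]
        score = sum(cosine(f, q) for f, q in zip(window, query)) / n
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy 2-dimensional "features"; real systems use much richer representations.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
query = [[0.0, 1.0], [1.0, 1.0]]
hits = detect_term(frames, query)
```

With this toy data, only the window starting at frame 1 matches the query closely enough to be reported. Rigid frame-by-frame matching like this does not handle variation in speaking rate or speaker, which is one reason the research above turns to learned representations.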