Keyword Spotting
Keyword spotting (KWS) is the task of detecting predefined words in continuous audio streams efficiently and accurately, and it is a core component of voice-activated devices. Current research emphasizes robustness in noisy environments and resource-constrained settings. It explores training techniques such as contrastive learning and multi-task learning, and architectures such as Transformers and spiking neural networks, often combined with attention mechanisms and efficient feature extraction. These advances aim to improve accuracy, reduce latency and energy consumption, and enable personalized and multilingual KWS, with applications ranging from voice assistants to aviation safety.
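To make "detecting predefined words within continuous audio streams" concrete, here is a minimal sketch of a streaming decision stage. It is not taken from any of the papers listed below. It assumes a hypothetical acoustic model has already produced one keyword score in [0, 1] per audio frame. The function name `detect_keyword` and the parameters `window`, `threshold`, and `refractory` are illustrative choices, not part of any library.

```python
from collections import deque


def detect_keyword(frame_scores, window=5, threshold=0.6, refractory=10):
    """Return the frame indices at which a keyword detection fires.

    frame_scores: iterable of per-frame keyword scores in [0, 1],
        assumed to come from some upstream model (hypothetical here).
    window: number of recent frames averaged to smooth the scores.
    threshold: smoothed score at or above which a detection fires.
    refractory: frames to skip after a detection, so one spoken
        keyword is not reported several times.
    """
    recent = deque(maxlen=window)  # sliding window of the latest scores
    detections = []
    cooldown = 0
    for i, score in enumerate(frame_scores):
        recent.append(score)
        if cooldown > 0:
            # Still inside the refractory period of the last detection.
            cooldown -= 1
            continue
        smoothed = sum(recent) / len(recent)
        # Only decide once the window is full, to avoid firing on
        # a single early high score.
        if len(recent) == window and smoothed >= threshold:
            detections.append(i)
            cooldown = refractory
    return detections


if __name__ == "__main__":
    # Synthetic scores: background, a sustained keyword region, background.
    scores = [0.1] * 10 + [0.9] * 8 + [0.1] * 10
    print(detect_keyword(scores))
```

Averaging over a short window trades a few frames of latency for fewer false alarms on isolated score spikes, which is one simple instance of the accuracy and latency trade-off mentioned above.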
Papers
Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder
Alexandre Bittar, Paul Dixon, Mohammad Samragh, Kumari Nishu, Devang Naik
PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords
Yong-Hyeok Lee, Namhyun Cho
Improving Small Footprint Few-shot Keyword Spotting with Supervision on Auxiliary Data
Seunghan Yang, Byeonggeun Kim, Kyuhong Shim, Simyung Chang