Keyword Spotting

Keyword spotting (KWS) is the task of detecting predefined words in continuous audio streams efficiently and accurately; it is a core component of voice-activated devices and similar applications. Current research emphasizes robustness in noisy environments and resource-constrained settings. Techniques under study include contrastive learning, multi-task learning, and architectures such as Transformers and spiking neural networks, often combined with attention mechanisms and efficient feature extraction. These advances aim to improve accuracy, reduce latency and energy consumption, and enable personalized and multilingual KWS, with applications ranging from voice assistants to aviation safety.
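To make "detecting predefined words within continuous audio streams" concrete, here is a minimal sketch of the decision stage only. It assumes some upstream model (not shown, and not taken from any paper listed here) already emits one keyword score in [0, 1] per audio frame. The function name `detect_keyword` and the parameters `window`, `threshold`, and `refractory` are hypothetical, chosen for illustration. The sketch smooths scores with a moving average and fires when the average reaches a threshold, then stays silent for a few frames so one utterance is not reported twice.

```python
from collections import deque
from typing import Iterable, Iterator


def detect_keyword(
    frame_scores: Iterable[float],
    window: int = 5,         # assumed smoothing length, in frames
    threshold: float = 0.8,  # assumed decision threshold
    refractory: int = 10,    # assumed number of frames to skip after a detection
) -> Iterator[int]:
    """Yield the frame index of each detection in a stream of per-frame scores."""
    recent = deque(maxlen=window)  # holds the last `window` scores
    cooldown = 0
    for index, score in enumerate(frame_scores):
        recent.append(score)
        if cooldown > 0:
            # Still inside the refractory period of the previous detection.
            cooldown -= 1
            continue
        smoothed = sum(recent) / len(recent)
        # Only decide once the smoothing window is full.
        if len(recent) == window and smoothed >= threshold:
            yield index
            cooldown = refractory


# Hypothetical score stream: background, one keyword burst, background.
scores = [0.1] * 10 + [0.95] * 8 + [0.1] * 20
detections = list(detect_keyword(scores))
print(detections)
```

Because the function consumes any iterable and yields detections lazily, it can run over a live stream frame by frame. Real systems typically tune the window, threshold, and refractory period against a target false-alarm rate; the values above are placeholders.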

Papers

June 9, 2024

Sparse Binarization for Fast Keyword Spotting
Jonathan Svirsky, Uri Shaham, Ofir Lindenbaum
Keyword Spotting Binarization Method Voice Activated Hand Free

June 8, 2024

Relational Proxy Loss for Audio-Text based Keyword Spotting
Youngmoon Jung, Seungjin Lee, Joon-Young Yang, Jaeyoung Roh, Chang Woo Han, Hoon-Young Cho
Text Embeddings Keyword Spotting Acoustic Word Embeddings Metric Learning Loss Keyword Enrollment Proxy Based Loss

June 4, 2024

Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition
Hao Yen, Pin-Jui Ku, Sabato Marco Siniscalchi, Chin-Hui Lee
Keyword Spotting Spoken Language Robust Speaker Representation

May 23, 2024

End-to-End User-Defined Keyword Spotting using Shifted Delta Coefficients
Kesavaraj V, Anuprabha M, Anil Kumar Vuppala
Speech Signal Keyword Spotting Spoken Language Keyword Enrollment Audio Text Pair

May 19, 2024

Towards Contactless Elevators with TinyML using CNN-based Person Detection and Keyword Spotting
Anway S. Pimpalkar, Deeplaxmi V. Niture
Keyword Spotting TinyML Model Tiny Machine Learning Elevator Dispatching Human Presence Detection

April 23, 2024

Multi-Sample Dynamic Time Warping for Few-Shot Keyword Spotting
Kevin Wilkinghoff, Alessia Cornaggia-Urrigshardt
Keyword Spotting Dynamic Time Warping

March 27, 2024

Noise-Robust Keyword Spotting through Self-supervised Pretraining
Jacob Mørk, Holger Severin Bovbjerg, Gergely Kiss, Zheng-Hua Tan
Self Supervised Learning Keyword Spotting Self Supervised Pretraining Supervised Learning Method

March 20, 2024

TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer
Yu Xi, Hao Li, Baochen Yang, Haoyu Li, Hainan Xu, Kai Yu
Keyword Spotting Word Detection Keyword Search Token and Duration Transducer

March 12, 2024

On-Device Domain Learning for Keyword Spotting on Low-Power Extreme Edge Embedded Systems
Cristian Cioflan, Lukas Cavigelli, Manuele Rusci, Miguel de Prado, Luca Benini
Domain Adaptation Edge Device Keyword Spotting Device Learning

January 26, 2024

Robust Dual-Modal Speech Keyword Spotting for XR Headsets
Zhuojiang Cai, Yuhan Ma, Feng Lu
Cross Modal Keyword Spotting Multimodal System Speech Interaction

January 12, 2024

Contrastive Learning With Audio Discrimination For Customizable Keyword Spotting In Continuous Speech
Yu Xi, Baochen Yang, Hao Li, Jiaqi Guo, Kai Yu
Contrastive Learning Keyword Spotting Continuous Speech Audio Text Audio Detection

December 17, 2023

Meta-AF Echo Cancellation for Improved Keyword Spotting
Jonah Casebeer, Junkai Wu, Paris Smaragdis
Training Data Speech Recognition Keyword Spotting Echo Cancellation Adaptive Filter

November 6, 2023

Personalizing Keyword Spotting with Speaker Information
Beltrán Labrador, Pai Zhu, Guanlong Zhao, Angelo Scorza Scarpati, Quan Wang, Alicia Lozano-Diez, Alex Park, Ignacio López Moreno
Keyword Spotting Speaker Information Keyword Detection Feature Wise Linear Modulation

August 31, 2023

August 15, 2023

End-to-End Open Vocabulary Keyword Search With Multilingual Neural Representations
Bolaji Yusuf, Jan Cernocky, Murat Saraclar
Automatic Speech Recognition System Keyword Spotting Keyword Search

August 7, 2023

Keyword Spotting Simplified: A Segmentation-Free Approach using Character Counting and CTC re-scoring
George Retsinas, Giorgos Sfikas, Christophoros Nikou
Object Detection Model Keyword Spotting CTC Based Borda Counting Image Level Annotation Segmentation Free

July 24, 2023

Online Continual Learning in Keyword Spotting for Low-Resource Devices via Pooling High-Order Temporal Statistics
Umberto Michieli, Pablo Peso Parada, Mete Ozay
Audio Representation Keyword Spotting Online Continual Learning Resource Constrained Device Two Level Lattice Neural Network Temporal Pooling Temporal Lift Pooling

July 6, 2023

On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
Gene-Ping Yang, Yue Gu, Qingming Tang, Dongsu Du, Yuzong Liu
Knowledge Distillation Keyword Spotting Self Supervised Model Self Supervised Speech Representation Learning Large Scale Self Supervised Self Distilled Self Supervised

Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder

PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords

Improving Small Footprint Few-shot Keyword Spotting with Supervision on Auxiliary Data
