Personalized Voice Activity Detection

Personalized Voice Activity Detection (PVAD) aims to detect when a specific enrolled speaker is talking, even in the presence of background noise and other voices, which improves downstream applications such as speech recognition and hands-free communication. Current research focuses on making PVAD more robust, for example through self-supervised pretraining of LSTM encoders, and on alternative input methods such as bone-conduction microphones, which offer better signal isolation and lower power consumption. These advances matter for accurate, efficient personalized voice-driven technologies on resource-constrained devices and in noisy environments.
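
As a rough illustration of the decision a PVAD system makes, the sketch below labels each audio frame as non-speech, target speech, or non-target speech. It is a toy under stated assumptions, not the LSTM-based approach mentioned above: it assumes per-frame speaker embeddings are already available from some external model, and the function names, label strings, and thresholds are hypothetical choices made only for this example.

```python
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in frame) / len(frame)


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pvad_labels(frames, frame_embeddings, enrolled_embedding,
                energy_threshold=0.01, similarity_threshold=0.7):
    """Label each frame (hypothetical helper, illustrative thresholds).

    A frame below the energy threshold is treated as non-speech. Otherwise
    its embedding is compared with the enrolled speaker's embedding to
    separate target speech from non-target speech.
    """
    labels = []
    for frame, embedding in zip(frames, frame_embeddings):
        if frame_energy(frame) < energy_threshold:
            labels.append("non-speech")
        elif cosine_similarity(embedding, enrolled_embedding) >= similarity_threshold:
            labels.append("target-speech")
        else:
            labels.append("non-target-speech")
    return labels
```

In practice the energy gate and the fixed similarity threshold would be replaced by a trained model that consumes acoustic features and the enrolled speaker's embedding jointly; the sketch only shows the shape of the per-frame output.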

Papers