Speech Enhancement Model
Speech enhancement models aim to improve the clarity of speech signals degraded by noise and reverberation, primarily for applications such as hearing aids and voice assistants. Current research emphasizes ultra-low-latency models for real-time use, often employing asymmetric windows, adaptive filterbanks, and novel architectures such as Mamba. It also explores self-supervised speech representations to improve training and reduce model size. The field is crucial for improving the accessibility and usability of speech technologies, with impact on hearing healthcare, communication systems, and human-computer interaction.
Papers
PAAPLoss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement
Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement
Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj