Speech Enhancement
Speech enhancement aims to improve the clarity and intelligibility of speech signals degraded by noise and reverberation, which is crucial for applications such as hearing aids and voice assistants. Current research focuses on computationally efficient models, including lightweight convolutional neural networks, recurrent neural networks (such as LSTMs), and diffusion models. These are often combined with multi-channel processing, attention mechanisms, and self-supervised learning to achieve high performance at low latency. Together, these advances are driving progress toward more robust and resource-efficient speech enhancement systems, particularly for low-power devices and challenging acoustic environments. The field also explores integrating visual information and advanced signal processing techniques to improve performance further.
Papers
Does Single-channel Speech Enhancement Improve Keyword Spotting Accuracy? A Case Study
Avamarie Brueggeman, Takuya Higuchi, Masood Delfarah, Stephen Shum, Vineet Garg
Multichannel Voice Trigger Detection Based on Transform-average-concatenate
Takuya Higuchi, Avamarie Brueggeman, Masood Delfarah, Stephen Shum
Joint Minimum Processing Beamforming and Near-end Listening Enhancement
Andreas J. Fuglsig, Jesper Jensen, Zheng-Hua Tan, Lars S. Bertelsen, Jens Christian Lindof, Jan Østergaard
Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement
Shafique Ahmed, Chia-Wei Chen, Wenze Ren, Chin-Jou Li, Ernie Chu, Jun-Cheng Chen, Amir Hussain, Hsin-Min Wang, Yu Tsao, Jen-Cheng Hou
Diffusion-based speech enhancement with a weighted generative-supervised learning loss
Jean-Eudes Ayilo, Mostafa Sadeghi, Romain Serizel
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement
Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling
Unsupervised speech enhancement with diffusion-based generative models
Berné Nortier, Mostafa Sadeghi, Romain Serizel
Posterior sampling algorithms for unsupervised speech enhancement with recurrent variational autoencoder
Mostafa Sadeghi, Romain Serizel
Efficient Multi-Channel Speech Enhancement with Spherical Harmonics Injection for Directional Encoding
Jiahui Pan, Pengjie Shen, Hui Zhang, Xueliang Zhang