Neural Speech Enhancement
Neural speech enhancement aims to improve the quality and intelligibility of speech degraded by noise and reverberation. Current research focuses on computationally efficient models, such as LSTM-based networks and autoencoders. Many approaches also incorporate multimodal (audio-visual) input or use techniques such as latent diffusion and multi-band processing to improve performance. These advances matter for applications ranging from hearing aids and voice assistants to robust speech recognition in challenging acoustic environments, and they improve both objective metrics and subjective listening quality.
Papers
Direction-Aware Adaptive Online Neural Speech Enhancement with an Augmented Reality Headset in Real Noisy Conversational Environments
Kouhei Sekiguchi, Aditya Arie Nugraha, Yicheng Du, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii
Direction-Aware Joint Adaptation of Neural Speech Enhancement and Recognition in Real Multiparty Conversational Environments
Yicheng Du, Aditya Arie Nugraha, Kouhei Sekiguchi, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii