Reverberant Speech
Reverberant speech is speech distorted by sound reflections from the surrounding room, and it presents a significant challenge in audio processing. Current research focuses on robust methods for speech dereverberation and room impulse response (RIR) estimation, employing deep learning architectures such as generative adversarial networks (GANs) and other neural networks for tasks such as complex time-frequency masking and feature encoding. These advances aim to improve speech recognition accuracy, enhance audio quality in applications such as hearing aids and virtual reality, and enable more accurate estimation of room acoustic parameters directly from speech recordings. The ultimate goal is systems that can process and understand speech reliably even in complex acoustic environments.
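As a rough illustration of the terms above, reverberation is commonly modeled as the convolution of a dry signal with an RIR, and clarity (C50) is the ratio in dB of the RIR energy in the first 50 ms after the direct sound to the energy arriving later. The sketch below is only a toy under stated assumptions: the RIR is synthetic (exponentially decaying noise plus a unit direct-path impulse), and the function names `synthetic_rir`, `reverberate`, and `c50_db` are hypothetical, not taken from any of the papers listed here.

```python
import numpy as np

def synthetic_rir(rt60=0.5, sr=16000, length_s=1.0, seed=0):
    """Toy RIR (assumption): unit direct path plus exponentially decaying noise."""
    rng = np.random.default_rng(seed)
    n = int(length_s * sr)
    t = np.arange(n) / sr
    # Amplitude envelope that drops by 60 dB after rt60 seconds.
    decay = 10.0 ** (-3.0 * t / rt60)
    rir = 0.1 * rng.standard_normal(n) * decay
    rir[0] = 1.0  # direct-path impulse
    return rir

def reverberate(dry, rir):
    """Model reverberant speech as the dry signal convolved with the RIR."""
    return np.convolve(dry, rir)

def c50_db(rir, sr):
    """Clarity index: early (first 50 ms) to late energy ratio of the RIR, in dB."""
    onset = int(np.argmax(np.abs(rir)))  # take the strongest tap as the direct sound
    split = onset + int(0.050 * sr)
    early = np.sum(rir[onset:split] ** 2)
    late = np.sum(rir[split:] ** 2)
    return 10.0 * np.log10(early / late)

if __name__ == "__main__":
    sr = 16000
    for rt60 in (0.2, 1.0):
        rir = synthetic_rir(rt60=rt60, sr=sr)
        print(f"rt60={rt60:.1f} s  C50={c50_db(rir, sr):.1f} dB")
```

In this toy model a longer decay time puts more energy into the late tail, so C50 falls as the room becomes more reverberant. Real systems that estimate such parameters from speech do not have access to the RIR and must infer them from the recording itself.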
Papers
High Fidelity Neural Audio Compression
Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation
Marvin Lavechin, Marianne Métais, Hadrien Titeux, Alodie Boissonnet, Jade Copet, Morgane Rivière, Elika Bergelson, Alejandrina Cristia, Emmanuel Dupoux, Hervé Bredin