Speech Denoising

Speech denoising removes unwanted noise from audio recordings to improve speech quality and intelligibility. Current research focuses on robust models, including diffusion models, generative adversarial networks, and architectures built on self-attention and vector quantization. These models often operate directly on waveforms or time-frequency representations, which avoids the limitations of intermediate steps such as vocoding. The work draws on both supervised and self-supervised learning, with growing emphasis on handling diverse noise types and scarce clean training data. Better speech denoising benefits applications ranging from assistive listening devices to robust automatic speech recognition and bioacoustic analysis.
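To make "operating on a time-frequency representation" concrete, here is a minimal sketch of a classical spectral-gating baseline in Python with NumPy. It is not one of the learned models named above. It only illustrates the common pipeline of transforming to the time-frequency domain, applying a per-bin gain, and resynthesizing the waveform. The function names (`stft`, `istft`, `spectral_gate`), the frame and hop sizes, the gain floor, and the assumption that a noise-only excerpt is available are all illustrative choices, not taken from any particular paper.

```python
import numpy as np


def stft(x, frame=512, hop=128):
    """Short-time Fourier transform with a Hann window (one row per frame)."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, frame=512, hop=128):
    """Inverse STFT by windowed overlap-add, normalized by the summed window energy."""
    window = np.hanning(frame)
    frames = np.fft.irfft(spec, n=frame, axis=1) * window
    length = hop * (len(frames) - 1) + frame
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, f in enumerate(frames):
        out[i * hop : i * hop + frame] += f
        norm[i * hop : i * hop + frame] += window ** 2
    return out / np.maximum(norm, 1e-8)


def spectral_gate(noisy, noise_only, frame=512, hop=128, floor=0.05):
    """Attenuate time-frequency bins that are dominated by the estimated noise.

    noisy      : 1-D array holding the noisy recording
    noise_only : 1-D array holding noise alone (assumed to be available)
    floor      : minimum gain, which limits "musical noise" artifacts
    """
    n = len(noisy)
    # Zero-pad so that every original sample is covered by full frame overlap.
    padded = np.pad(noisy, frame)
    spec = stft(padded, frame, hop)

    # Average noise magnitude per frequency bin.
    noise_mag = np.abs(stft(noise_only, frame, hop)).mean(axis=0)

    # Wiener-like gain in [floor, 1], computed independently for each bin.
    mag = np.abs(spec)
    gain = np.maximum(1.0 - (noise_mag / np.maximum(mag, 1e-12)) ** 2, floor)

    # Apply the gain, resynthesize, and trim the padding.
    return istft(spec * gain, frame, hop)[frame : frame + n]
```

Calling `spectral_gate(noisy, noise_only)` returns an array of the same length as `noisy`. Learned time-frequency approaches can be read as replacing the hand-written `gain` rule with a mask or spectrum predicted by a trained network, while waveform-domain models skip the transform entirely.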

Papers