Diffusion-Based Speech Enhancement

Diffusion-based speech enhancement uses generative models to recover clean speech from noisy recordings by learning to reverse a diffusion process that gradually corrupts the signal. Current research on score-based diffusion models aims to reduce computational cost and improve performance, especially in mismatched conditions (where test data differ from the training data), using techniques such as Brownian bridge processes and efficient samplers (e.g., Heun-based). The approach shows promise in surpassing traditional discriminative methods, particularly on non-additive distortions such as reverberation, and may also benefit related speech processing tasks, including speaker verification and speech separation.
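
To make the idea of a Heun-based sampler more concrete, here is a minimal toy sketch in Python. It is not taken from any of the papers in this topic and it is not a speech model. It assumes a one-dimensional Gaussian "data" distribution, for which the score is known in closed form and can stand in for a trained score network, and it assumes a variance-exploding probability-flow ODE of the form dx/dsigma = -sigma * score(x, sigma). The names `score_gaussian` and `heun_sampler`, the noise schedule, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def score_gaussian(x, sigma, mu=0.0, s=1.0):
    """Analytic score of N(mu, s^2 + sigma^2).

    Assumption: this closed-form score replaces the neural score
    network that a real diffusion model would learn.
    """
    return -(x - mu) / (s**2 + sigma**2)


def heun_sampler(x, sigmas, score_fn):
    """Integrate dx/dsigma = -sigma * score(x, sigma) with Heun's method.

    `sigmas` is a decreasing sequence of noise levels; each step uses an
    Euler predictor followed by a trapezoidal (second-order) correction.
    """
    for sigma_cur, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        h = sigma_next - sigma_cur                  # negative step size
        d_cur = -sigma_cur * score_fn(x, sigma_cur)
        x_euler = x + h * d_cur                     # predictor (Euler)
        d_next = -sigma_next * score_fn(x_euler, sigma_next)
        x = x + 0.5 * h * (d_cur + d_next)          # corrector (Heun)
    return x


rng = np.random.default_rng(0)
sigmas = np.linspace(10.0, 0.01, 200)               # hypothetical schedule

# Start from the noisy marginal N(0, 1 + sigma_max^2) and denoise.
x_start = rng.normal(0.0, np.sqrt(1.0 + sigmas[0] ** 2), size=1000)
x_final = heun_sampler(x_start, sigmas, score_gaussian)

# For this toy ODE the exact solution is a rescaling of the start point.
x_exact = x_start * np.sqrt(1.0 + sigmas[-1] ** 2) / np.sqrt(1.0 + sigmas[0] ** 2)
print("max abs error vs. exact solution:", np.max(np.abs(x_final - x_exact)))
```

In a real enhancement system the score function would be a neural network conditioned on the noisy recording, and the state `x` would typically be a time-frequency representation of the speech rather than a scalar. The second-order correction is what lets Heun-type samplers use fewer steps than a plain Euler scheme at comparable accuracy, at the price of two score evaluations per step.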

Papers