Personalized Speech Enhancement

Personalized speech enhancement (PSE) aims to improve audio quality by tailoring noise reduction and echo cancellation to an individual speaker's voice, so that the target talker is preserved while background noise, echo, and interfering voices are suppressed. Current research focuses on efficient and robust PSE models, typically built on deep learning architectures such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), and transformers. Two recurring concerns are how to integrate speaker embeddings into the network effectively and how to keep computational overhead low enough for real-time use. The field matters for applications such as teleconferencing and hearing aids, where better speech intelligibility in noisy environments directly improves the user experience.
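
As a rough illustration of what "integrating speaker embeddings" can mean, the sketch below concatenates a fixed speaker embedding onto every spectral frame and passes the result through a single linear layer with a sigmoid to produce a mask. This is a toy example under stated assumptions, not the method of any particular paper: the function names, the single-layer model, and the weights are all hypothetical, and real systems use the RNN, CNN, or transformer architectures mentioned above with embeddings from a trained speaker encoder.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def personalized_mask(noisy_frames, speaker_embedding, weights, bias):
    """Estimate a per-bin mask in (0, 1) for each frame, conditioned on a speaker.

    noisy_frames: list of frames, each a list of F magnitude values
    speaker_embedding: list of D values (assumed to come from an enrollment step)
    weights: F rows of (F + D) values; bias: F values (hypothetical parameters)
    """
    masks = []
    for frame in noisy_frames:
        # Concatenate the same speaker embedding onto every frame.
        features = list(frame) + list(speaker_embedding)
        # One linear layer followed by a sigmoid yields one mask value per bin.
        mask = [
            sigmoid(sum(w * x for w, x in zip(row, features)) + b)
            for row, b in zip(weights, bias)
        ]
        masks.append(mask)
    return masks


def apply_mask(noisy_frames, masks):
    """Scale each magnitude bin by its mask value."""
    return [
        [m * x for m, x in zip(mask, frame)]
        for mask, frame in zip(masks, noisy_frames)
    ]
```

Because the embedding is part of the input features, the same noisy frame can produce a different mask for a different enrolled speaker, which is the basic idea behind personalization. Concatenation is only one conditioning choice; it is used here because it is the simplest to show.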

Papers