Speech Quality Prediction

Speech quality prediction aims to automatically estimate the perceived quality of speech signals, typically by training machine learning models to predict human ratings such as the Mean Opinion Score (MOS). Current research focuses on improving prediction accuracy across diverse conditions (e.g., synthesized speech, noisy audio) using deep learning architectures such as convolutional and recurrent neural networks, often building on pre-trained models and applying techniques such as quantization-aware training to reduce resource requirements. These advances matter for quality of experience in applications such as teleconferencing and speech synthesis, and they also offer insight into human speech perception.
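As a rough illustration of the prediction target and of how a predictor might be scored against it, the sketch below averages listener ratings into a MOS per utterance and compares them with model outputs using RMSE and Pearson correlation. Everything in it is an assumption made for illustration: the utterance names, the ratings, the predicted scores, and the choice of these two metrics are hypothetical and not taken from any particular paper or dataset.

```python
from math import sqrt
from statistics import mean


def mean_opinion_score(ratings):
    """Average the listener ratings (assumed 1-5 scale) for one utterance."""
    return mean(ratings)


def rmse(predicted, target):
    """Root-mean-square error between predicted and reference scores."""
    return sqrt(mean((p - t) ** 2 for p, t in zip(predicted, target)))


def pearson(x, y):
    """Pearson linear correlation coefficient between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical listener ratings per utterance (illustrative values only).
ratings = {
    "utt_clean": [5, 4, 5, 4],
    "utt_noisy": [2, 3, 2, 1],
    "utt_tts": [3, 4, 3, 4],
}

# Hypothetical outputs of some quality-prediction model for the same utterances.
predicted = {"utt_clean": 4.2, "utt_noisy": 2.4, "utt_tts": 3.3}

ids = sorted(ratings)
mos = [mean_opinion_score(ratings[u]) for u in ids]
pred = [predicted[u] for u in ids]

error = rmse(pred, mos)
correlation = pearson(pred, mos)
print(f"RMSE: {error:.3f}, Pearson r: {correlation:.3f}")
```

In this toy setup a lower RMSE and a correlation closer to 1 would indicate predictions that track the averaged human ratings more closely; a real evaluation would use far more utterances and listeners.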

Papers