Speech Quality
Speech quality assessment measures the clarity and pleasantness of speech signals, both subjectively and objectively, and is crucial for applications ranging from telecommunications to clinical voice analysis. Current research focuses on developing accurate and efficient automatic speech quality assessment models. These models often employ deep neural networks such as Convolutional Neural Networks (CNNs), Conformers, and large language models (LLMs), alongside techniques such as self-supervised learning, and quantization to reduce computational demands for real-time applications. These advances matter for improving the user experience in various technologies and for enabling more objective clinical evaluations of voice disorders.
Papers
Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding
Tan Dat Nguyen, Ji-Hoon Kim, Jeongsoo Choi, Shukjae Choi, Jinseok Park, Younglo Lee, Joon Son Chung
GAN-Based Speech Enhancement for Low SNR Using Latent Feature Conditioning
Shrishti Saha Shetu, Emanuël A. P. Habets, Andreas Brendel