Spontaneous Speech
Spontaneous speech research focuses on understanding and modeling naturally occurring conversation, with the aim of improving automatic speech recognition (ASR) and downstream applications such as Alzheimer's disease detection. Current research employs a range of machine learning models, including transformers, convolutional neural networks, and recurrent neural networks, often combining multimodal data (audio and text) with techniques such as attention mechanisms and data augmentation to improve performance. The field matters both for advancing ASR across multiple languages and for developing reliable diagnostic tools for neurological disorders, since spontaneous speech patterns carry rich information about the speaker.
Papers
CogniVoice: Multimodal and Multilingual Fusion Networks for Mild Cognitive Impairment Assessment from Spontaneous Speech
Jiali Cheng, Mohamed Elgaar, Nidhi Vakil, Hadi Amiri
Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models
Weiqin Li, Peiji Yang, Yicheng Zhong, Yixuan Zhou, Zhisheng Wang, Zhiyong Wu, Xixin Wu, Helen Meng