Source Speech

Source speech analysis extracts meaningful information from spoken language, covering tasks such as transcription correction, speaker identification, emotion recognition, and topic segmentation. Current research relies heavily on large language models (LLMs) and transformer-based architectures, often combined with self-supervised learning, multi-task learning, and multilingual training to improve performance and robustness across diverse languages and speaking styles. These advances are driving progress in applications such as speech-to-speech translation, real-time voice conversion, and accessibility for low-resource languages.

Papers