Spoken Utterance
Spoken utterance research focuses on understanding and modeling the complexities of human speech, with the aim of improving automatic speech processing and human-computer interaction. Current research emphasizes robust models for tasks such as turn-taking detection, speech enhancement, and emotion recognition, often using deep learning architectures such as transformers and graph neural networks to capture both linguistic and acoustic features. These advances matter for applications ranging from voice assistants and clinical diagnostics to more natural and engaging human-machine communication. Ongoing work also addresses ethical concerns, such as detecting bias in speaker diarization and identifying manipulated speech.