Speaking Style
Speaking style research aims to understand and manipulate the acoustic and prosodic features that characterize how individuals speak, with the goal of improving speech synthesis and analysis. Current approaches include probabilistic attribute embeddings for spoofing detection, style-controllable generative models (such as diffusion models and VAEs) for manipulating speech characteristics, and hierarchical transformers for context-aware style prediction in long-form speech. This work has implications for voice conversion, text-to-speech synthesis, and human-robot interaction, where it enables more natural and expressive synthetic speech and a better understanding of human communication.