Fine-Grained Prosody
Fine-grained prosody research focuses on modeling and controlling the subtle variations in intonation, rhythm, and stress that convey emotion and meaning beyond the words themselves. Current work concentrates on end-to-end models, often built on variational autoencoders (VAEs) or hierarchical architectures, that disentangle prosody from other speech characteristics such as speaker identity and background noise, which improves control and transferability. These advances support more natural and expressive speech synthesis, including applications such as cross-speaker style transfer and emotionally nuanced speech generation.
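To make the VAE idea concrete, here is a minimal sketch of the two ingredients such models typically rely on: sampling a latent with the reparameterization trick and penalizing it with a KL term against a standard normal prior. It is illustrative only and not taken from any specific paper in this area. The per-phoneme granularity, the 2-dimensional latent, the posterior values, and the function names `sample_prosody_latent` and `kl_to_standard_normal` are all assumptions made for the example.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def sample_prosody_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


# Hypothetical posterior parameters (mean, log-variance) for three phonemes,
# as a prosody encoder might produce them. The values are made up.
posteriors = [
    ([0.0, 0.0], [0.0, 0.0]),
    ([0.5, -0.2], [-1.0, -0.5]),
    ([1.0, 0.3], [0.2, -2.0]),
]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
latents = [sample_prosody_latent(mu, lv, rng) for mu, lv in posteriors]
kl_terms = [kl_to_standard_normal(mu, lv) for mu, lv in posteriors]

for i, (z, kl) in enumerate(zip(latents, kl_terms)):
    print(f"phoneme {i}: z={z}, KL={kl:.4f}")
```

In a full system, each latent would condition the decoder alongside text and speaker embeddings. The KL term limits how much information the latent can carry, which is one common way to discourage it from encoding speaker identity or content rather than prosody.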