Natural-Sounding Speech
Natural-sounding speech synthesis aims to generate human-like speech from text, with a focus on improving quality, diversity, and robustness across languages and speaking styles. Current research emphasizes model architectures such as diffusion models, variational autoencoders, and transformer networks, often combined with techniques such as disentangled representations and adversarial training to improve naturalness and give finer control over prosody and emotion. The field matters for applications ranging from assistive technologies to personalized voice assistants, and also for countering synthetic misinformation. These needs drive ongoing work on more accurate and efficient synthesis systems and on robust methods for detecting synthetic speech.