Generative Speech
Generative speech research aims to build systems that produce realistic, controllable speech from inputs such as text or other audio. Current work concentrates on robust models, often built on neural codecs and large language models, that handle diverse tasks including text-to-speech, voice conversion, and speech enhancement, and that remain effective in noisy conditions. These advances matter for applications ranging from personalized voice assistants and dubbing to improved accessibility for individuals with speech impairments. The field also addresses concerns about the malicious use of synthetic speech through techniques such as watermarking.
Papers
September 5, 2024
September 27, 2023
August 14, 2023
July 18, 2023
June 1, 2023