Multi-Speaker Text-to-Speech
Multi-speaker text-to-speech (TTS) aims to synthesize realistic speech from text in the voices of many different speakers, including speakers unseen during training. Current research focuses on improving the efficiency and quality of these systems through techniques such as frame selection, data augmentation with large language models, and the adaptation of pre-trained models via hypernetworks or contrastive learning. These advances matter because they ease constraints on data availability and computational resources, paving the way for more versatile and accessible speech synthesis across diverse languages and speaker demographics.