Multi-Speaker Text-to-Speech
Multi-speaker text-to-speech (TTS) aims to synthesize natural-sounding speech from text in the voices of many different speakers, including speakers unseen during training. Current research focuses on improving the efficiency and quality of these systems through techniques such as frame selection, data augmentation with large language models, and the adaptation of pre-trained models via hypernetworks or contrastive learning. These advances matter because they ease the constraints of limited training data and compute, enabling speech synthesis that is more versatile and accessible across languages and speaker demographics.
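As a rough illustration of how a single model can serve many speakers, one common pattern is to summarize a short reference recording as a fixed-size speaker vector and add it to the text encoder's outputs before decoding. The sketch below is a toy version of that idea in plain Python. The function names `speaker_embedding` and `condition_on_speaker`, the mean-pooling step, and the toy numbers are all assumptions made for illustration, not the method of any particular paper on this topic.

```python
import math
from typing import List

Vector = List[float]


def speaker_embedding(reference_frames: List[Vector]) -> Vector:
    """Mean-pool per-frame features into one L2-normalized speaker vector.

    Hypothetical stand-in for a learned speaker encoder.
    """
    if not reference_frames:
        raise ValueError("need at least one reference frame")
    dim = len(reference_frames[0])
    count = len(reference_frames)
    mean = [sum(frame[i] for frame in reference_frames) / count for i in range(dim)]
    # Guard against a zero vector so the division below is always defined.
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0
    return [x / norm for x in mean]


def condition_on_speaker(text_states: List[Vector], spk: Vector) -> List[Vector]:
    """Add the same speaker vector to every text encoder state."""
    for state in text_states:
        if len(state) != len(spk):
            raise ValueError("text state and speaker vector sizes differ")
    return [[s + e for s, e in zip(state, spk)] for state in text_states]


# Toy data (assumed): two 2-dimensional reference frames and two text states.
reference_frames = [[3.0, 0.0], [3.0, 8.0]]
text_states = [[1.0, 1.0], [0.0, 2.0]]

spk = speaker_embedding(reference_frames)
conditioned = condition_on_speaker(text_states, spk)
print(spk)
print(conditioned)
```

Because the speaker vector is computed from reference audio rather than looked up in a fixed table, the same code path applies to a speaker who never appeared in training, which is the setting the overview describes. Real systems replace the mean-pooling with a trained encoder and often use richer conditioning than simple addition.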