Shot Voice Cloning
Shot voice cloning synthesizes speech in a new voice from limited training data, aiming for natural-sounding output with high speaker similarity. Current research concentrates on zero-shot and few-shot scenarios and uses architectures such as transformers, GANs, and recurrent networks, often combined with multi-modal learning and meta-learning techniques to improve efficiency and performance. The field matters because it can improve text-to-speech systems, personalize voice assistants, and make speech synthesis more accessible across languages and speakers, particularly in low-resource settings.
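Speaker similarity is commonly scored as the cosine similarity between a speaker embedding of the reference audio and one of the synthesized audio. The sketch below illustrates that idea only. It assumes the embeddings have already been produced by some speaker encoder (not shown here), and the function names and toy vectors are hypothetical, not taken from any particular paper or library.

```python
import math


def mean_embedding(embeddings):
    """Average several reference embeddings (the few-shot case).

    Assumption: each embedding is a list of floats of equal length,
    produced by an external speaker encoder that is not part of this sketch.
    """
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity in [-1, 1]; higher means the voices are closer."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-length embedding has no direction")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real speaker embeddings.
    reference_clips = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
    synthesized = [0.85, 0.15, 0.05]
    target = mean_embedding(reference_clips)
    print(f"speaker similarity: {cosine_similarity(target, synthesized):.3f}")
```

In the zero-shot case a single reference clip would serve as the target directly. Averaging several clips, as above, is one simple way to use a few-shot reference set.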