Low Resource Text to Speech

Low-resource text-to-speech (TTS) research aims to generate high-quality synthetic speech from limited training data, addressing the data scarcity that affects many languages. Current work explores cross-lingual transfer learning with input representations such as phonetic and articulatory features, data augmentation (noise addition, pitch shifting, voice conversion), and semi-supervised learning to improve model performance with minimal data. These advances matter for extending speech technology to under-resourced languages and communities, with applications in accessibility, language preservation, and personalized voice assistants.
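
As a rough illustration of the simplest augmentation mentioned above, noise addition, the sketch below mixes white Gaussian noise into a waveform at a chosen signal-to-noise ratio. It is a minimal example under stated assumptions, not a method taken from any particular paper: the function names `add_noise` and `measured_snr_db`, the toy sine-wave "utterance", and the SNR values are all hypothetical, and real pipelines would typically operate on recorded audio arrays rather than Python lists.

```python
import math
import random


def add_noise(samples, snr_db, seed=0):
    """Return a copy of `samples` with white Gaussian noise at the given SNR (dB).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR(dB) = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = [rng.gauss(0.0, 1.0) for _ in samples]
    # Rescale the drawn noise so its empirical power matches the target.
    drawn_power = sum(n * n for n in noise) / len(noise)
    scale = math.sqrt(noise_power / drawn_power)
    return [s + scale * n for s, n in zip(samples, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR (dB) of `noisy` relative to the `clean` reference."""
    signal_power = sum(s * s for s in clean) / len(clean)
    noise_power = sum((y - s) ** 2 for s, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Toy stand-in for an utterance: one second of a 220 Hz tone at 16 kHz.
sr = 16000
clean = [0.5 * math.sin(2 * math.pi * 220 * t / sr) for t in range(sr)]

# One clean recording becomes several training variants at different SNRs.
augmented = [add_noise(clean, snr_db, seed=i) for i, snr_db in enumerate((10, 20, 30))]
```

Each augmented copy keeps the original transcript, so a single recording yields several training pairs. Pitch shifting and voice conversion follow the same pattern but need signal-processing or model-based tooling beyond this sketch.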

Papers