Mongolian Text to Speech

Research on Mongolian text-to-speech (TTS) synthesis aims to build high-quality speech generation systems for a low-resource language, where training data is scarce. Current efforts use convolutional neural networks (CNNs) and other efficient architectures such as FastSpeech 2, often combined with data augmentation, to improve model robustness and reduce training time. This work matters because it gives researchers accessible tools and enables practical applications for Mongolian speakers, such as speech assistance and language-learning resources.

Papers