Neural TTS

Neural text-to-speech (TTS) research focuses on generating high-quality, natural-sounding synthetic speech with neural networks. Current efforts concentrate on finer control over prosody and emotion, adapting models to new speakers from minimal data, and improving quality for low-resource languages. The papers listed here draw on techniques such as vector quantization and normalizing flows, and on architectures such as neural HMMs and autoregressive models, with the aim of producing more expressive, natural, and versatile speech. These improvements matter for applications ranging from accessibility technologies to virtual assistants and interactive storytelling.

Papers

November 24, 2022

Prosody-controllable spontaneous TTS with neural HMMs
Harm Lameris, Shivam Mehta, Gustav Eje Henter, Joakim Gustafson, Éva Székely
Text to Speech Spontaneous Speech Neural TTS Prosody Control

November 13, 2022

OverFlow: Putting flows on top of neural transducers for better TTS
Shivam Mehta, Ambika Kirkland, Harm Lameris, Jonas Beskow, Éva Székely, Gustav Eje Henter
Speech Synthesis General Flow Neural Transducer TOp Front TopMost Neural TTS

November 1, 2022

Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features
Alexandra Vioni, Georgia Maniati, Nikolaos Ellinas, June Sig Sung, Inchul Hwang, Aimilios Chalamandaris, Pirros Tsiakoulis
Prosodic Feature Linguistic Feature Naturalness Assessment Neural TTS Speech Naturalness Neural Text to Speech

October 28, 2022

Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation
Nobuyuki Morioka, Heiga Zen, Nanxin Chen, Yu Zhang, Yifan Ding
Text to Speech Speaker Adaptation Multi Speaker Speaker Similarity Neural TTS Shot Speaker

October 27, 2022

Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations
Haohan Guo, Fenglong Xie, Xixin Wu, Hui Lu, Helen Meng
Low Resource Language Text to Speech Codebook Learning Low Resource Text to Speech Neural TTS

July 13, 2022

Text-driven Emotional Style Control and Cross-speaker Style Transfer in Neural TTS
Yookyung Shin, Younggun Lee, Suhee Jo, Yeongtae Hwang, Taesu Kim
Style Transfer Expressive Speech Style Encoder Multi Speaker TTS Expressive Text to Speech Neural TTS

June 30, 2022

R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS
Kyle Kastner, Aaron Courville
Text to Speech Autoregressive Neural Network Spectral Model Neural TTS

January 24, 2022

Disentangling Style and Speaker Attributes for TTS Style Transfer
Xiaochun An, Frank K. Soong, Lei Xie
Style Transfer Speaker Characteristic Style Disentanglement Neural TTS