Articulatory Synthesis

Articulatory synthesis generates speech from descriptions of how the articulators (such as the tongue, lips, and jaw) move within the vocal tract. Because its inputs correspond to physical speech production, it offers a more interpretable approach to speech synthesis than methods that model acoustics directly. Current research emphasizes efficient, high-quality models that map articulatory features (e.g., trajectories recorded with electromagnetic articulography) to speech waveforms, often using deep learning architectures such as autoencoders and generative adversarial networks (GANs) or techniques such as differentiable digital signal processing (DDSP). This approach holds promise for improving synthesis quality, giving finer control over synthesized speech, and supporting research into speech production and speech disorders.
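
To make the feature-to-waveform mapping concrete, here is a minimal sketch in the spirit of a DDSP harmonic synthesizer. It is not taken from any particular paper. The names `harmonic_synth` and `toy_controls`, the 12-channel input, the 200 Hz frame rate, and the 16 kHz sample rate are all illustrative assumptions, and the random projection in `toy_controls` is only an untrained stand-in for the learned network a real system would use.

```python
import numpy as np

def harmonic_synth(f0_frames, amp_frames, harmonic_weights,
                   frame_rate=200, sample_rate=16000):
    """Render audio from frame-level controls with a bank of harmonic sinusoids."""
    hop = sample_rate // frame_rate          # audio samples per control frame
    n_frames = len(f0_frames)
    n_samples = n_frames * hop
    frame_times = np.arange(n_frames) * hop
    sample_times = np.arange(n_samples)
    # Upsample frame-rate controls to audio rate by linear interpolation.
    f0 = np.interp(sample_times, frame_times, f0_frames)
    amp = np.interp(sample_times, frame_times, amp_frames)
    # Integrate instantaneous frequency to obtain the fundamental's phase.
    phase = 2.0 * np.pi * np.cumsum(f0) / sample_rate
    audio = np.zeros(n_samples)
    for k in range(1, harmonic_weights.shape[1] + 1):
        w = np.interp(sample_times, frame_times, harmonic_weights[:, k - 1])
        # Mute harmonics at or above the Nyquist frequency to avoid aliasing.
        w = np.where(k * f0 < sample_rate / 2, w, 0.0)
        audio += w * np.sin(k * phase)
    return amp * audio

def toy_controls(features, n_harmonics=20, seed=0):
    """Hypothetical stand-in for a trained decoder: features -> harmonic weights."""
    rng = np.random.default_rng(seed)
    proj = rng.normal(size=(features.shape[1], n_harmonics))  # untrained weights
    logits = features @ proj
    logits -= logits.max(axis=1, keepdims=True)               # stable softmax
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

# Synthetic "articulatory" trajectories: 100 frames, 12 channels (not real data).
n_frames = 100
t = np.linspace(0.0, 1.0, n_frames)
features = np.stack(
    [np.sin(2 * np.pi * 2 * t + p) for p in np.linspace(0.0, np.pi, 12)], axis=1
)
f0 = np.full(n_frames, 120.0)        # constant pitch in Hz, assumed given
amp = 0.5 * np.hanning(n_frames)     # smooth loudness envelope
weights = toy_controls(features)
audio = harmonic_synth(f0, amp, weights)
```

In a trained system, the controls (pitch, loudness, harmonic distribution) would be predicted from measured articulator positions, and the signal-processing stage would be written in a framework with automatic differentiation so that a waveform or spectral loss can be backpropagated through it.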

Papers