Articulatory Representation

Articulatory representation models the movements of speech articulators (e.g., tongue, lips) and their relationship to speech sounds, with the aim of making speech processing systems more natural and interpretable. Current research emphasizes robust inference of articulatory parameters from acoustic signals, using neural networks (including generative adversarial networks and autoencoders), factor graphs, and matrix factorization, often with multimodal (audio and visual) data. This work has implications for speech synthesis, speech recognition, and the understanding of speech production mechanisms, particularly in challenging settings such as dysarthric speech or low-resource languages. It also has implications for enabling more sophisticated robot-object interaction.
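To make the idea of inferring articulatory parameters from acoustic signals concrete, the following is a minimal sketch of acoustic-to-articulatory inversion. It is not one of the methods named above: it swaps in plain ridge regression as a linear stand-in for the neural and factorization-based models, and it runs on synthetic data. The feature dimensions (13 acoustic features, 6 articulatory parameters), the function names, and the regularization strength are all assumptions chosen for illustration.

```python
import numpy as np


def fit_ridge_inversion(acoustic, articulatory, alpha=1e-3):
    """Fit W so that [acoustic, 1] @ W approximates articulatory.

    Hypothetical helper: a linear ridge-regression baseline, not a method
    taken from any specific paper.
    """
    n_frames = acoustic.shape[0]
    X = np.hstack([acoustic, np.ones((n_frames, 1))])  # append a bias column
    # Closed-form ridge solution: (X^T X + alpha * I)^-1 X^T Y.
    # For simplicity the bias column is regularized as well.
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ articulatory)


def predict_articulation(acoustic, W):
    """Map acoustic frames to predicted articulatory parameters."""
    X = np.hstack([acoustic, np.ones((acoustic.shape[0], 1))])
    return X @ W


# Synthetic data (assumption): articulatory parameters are a noisy linear
# function of the acoustic features. Real speech is far less well behaved.
rng = np.random.default_rng(0)
n_frames, n_acoustic, n_artic = 500, 13, 6
true_map = rng.normal(size=(n_acoustic, n_artic))
acoustic = rng.normal(size=(n_frames, n_acoustic))
articulatory = acoustic @ true_map + 0.01 * rng.normal(size=(n_frames, n_artic))

# Fit on the first 400 frames, evaluate on the remaining 100.
W = fit_ridge_inversion(acoustic[:400], articulatory[:400])
pred = predict_articulation(acoustic[400:], W)
rmse = float(np.sqrt(np.mean((pred - articulatory[400:]) ** 2)))
print(f"held-out RMSE: {rmse:.4f}")
```

On real recordings the mapping is nonlinear and one-to-many, which is one reason the research summarized above turns to neural networks and probabilistic models rather than a single linear map.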

Papers