Expressive Speech

Expressive speech synthesis aims to generate speech that conveys not only linguistic content but also emotional nuance and stylistic variation, mirroring the richness of human communication. Current research focuses on making models more expressive, often using diffusion models, variational autoencoders, and graph neural networks, and on incorporating linguistic features (e.g., emphasis, semantics) to improve control and naturalness. Advances in this field matter for applications such as virtual assistants, audiobooks, and accessibility technologies, and they also inform the computational modeling of human communication.

Papers

June 14, 2022

Universally Expressive Communication in Multi-Agent Reinforcement Learning
Matthew Morris, Thomas D. Barrett, Arnu Pretorius
Multi Agent Reinforcement Learning Expressive Speech Efficient Protocol Joint Selection

April 21, 2022

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation
Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song, Yuma Shirahata, Hyun-Wook Yoon, Jae-Min Kim, Kentaro Tachibana
Voice Conversion Expressive Speech Emotional Text to Speech Low Resource Text to Speech

April 10, 2022

Expressiveness and Approximation Properties of Graph Neural Networks
Floris Geerts, Juan L. Reutter
Graph Neural Network GNN Architecture Expressive Speech Graph Learning Task Approximation Property Specific GNN

March 28, 2022

Expressive Talking Head Video Encoding in StyleGAN2 Latent-Space
Trevine Oorloff, Yaser Yacoob
StyleGAN Latent Facial Motion Expressive Speech Video Coding Face Video Latent Space Editing Facial Deformation

February 19, 2022

Is there an aesthetic component of language?
Harshit Parmar, Jeffrey P. Williams
Human Language Expressive Speech Behavior Expressivity Style Morphological Feature Aesthetic Feature Historical Linguistics

December 23, 2021

Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios
Qicong Xie, Tao Li, Xinsheng Wang, Zhichao Wang, Lei Xie, Guoqiao Yu, Guanglu Wan
Style Transfer Speech Synthesis Synthesized Speech Expressive Speech Single Speaker

November 29, 2021

Expressive Communication: A Common Framework for Evaluating Developments in Generative Models and Steering Interfaces
Ryan Louie, Jesse Engel, Anna Huang
Generative Model New Framework Expressive Speech Steering Technique HCI Research Co Creation HCI Education

November 19, 2021

Word-Level Style Control for Expressive, Non-attentive Speech Synthesis
Konstantinos Klapsas, Nikolaos Ellinas, June Sig Sung, Hyoungmin Park, Spyros Raptis
Speech Synthesis Speech Data Prosodic Feature Style Representation Speech Encoder Expressive Speech Expressive Speech Synthesis

November 11, 2021

DropGNN: Random Dropouts Increase the Expressiveness of Graph Neural Networks
Pál András Papp, Karolis Martinkus, Lukas Faber, Roger Wattenhofer
Graph Neural Network Gene Level GNN Structured Dropout Expressive Speech GNN Framework Benchmarking Deep Learning