Singing Voice

Singing voice research focuses on understanding and manipulating the acoustic properties of singing, chiefly to improve singing voice synthesis (SVS) and related technologies such as voice conversion. Current work relies heavily on deep learning models, including diffusion models, variational autoencoders, and transformers, and often incorporates self-supervised learning to mitigate data scarcity and to improve the controllability and naturalness of synthesized voices. This interdisciplinary field has applications in music production, virtual singers, accessibility technologies for people with voice impairments, and the detection of AI-generated "deepfakes."

Papers