Singing Voice Deepfake Detection

Singing voice deepfake detection (SVDD) research aims to develop robust methods for distinguishing authentic singing voices from AI-generated ones. Current efforts build on speech and music foundation models, often combining them in ensembles and using novel aggregation techniques such as Squeeze-and-Excitation Aggregation to improve detection accuracy. This work matters for combating the spread of manipulated audio in the music industry and the broader media landscape. Recent challenges highlight the need for models that generalize across diverse singing styles, languages, and musical contexts. Building large, diverse datasets such as CtrSVDD is another key focus, since such data supports more robust and reliable SVDD systems.
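
To make the aggregation idea more concrete, here is a minimal sketch of squeeze-and-excitation style weighting applied to the hidden layers of a foundation model. It is an illustration under assumptions, not the implementation from any particular SVDD paper: the function name `se_aggregate`, the tensor shapes, the bottleneck size, and the random weights (which would be learned in practice) are all hypothetical.

```python
import numpy as np

def se_aggregate(layer_feats, w1, w2):
    """Fuse per-layer features with squeeze-and-excitation style gates.

    layer_feats: array of shape (L, T, D), hidden states from L layers,
                 T time frames, D feature dimensions (assumed layout).
    w1: (L, R) and w2: (R, L) bottleneck weights; learned in a real model.
    """
    # Squeeze: summarize each layer as one scalar (mean over time and features).
    squeezed = layer_feats.mean(axis=(1, 2))            # shape (L,)
    # Excitation: small bottleneck with ReLU, then a sigmoid gate per layer.
    hidden = np.maximum(squeezed @ w1, 0.0)             # shape (R,)
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))        # shape (L,), values in (0, 1)
    # Rescale each layer by its gate and sum over layers.
    fused = (gates[:, None, None] * layer_feats).sum(axis=0)  # shape (T, D)
    return fused, gates

# Toy input standing in for foundation-model hidden states (random, illustrative only).
rng = np.random.default_rng(0)
num_layers, num_frames, feat_dim, bottleneck = 4, 10, 8, 2
feats = rng.standard_normal((num_layers, num_frames, feat_dim))
w1 = 0.1 * rng.standard_normal((num_layers, bottleneck))
w2 = 0.1 * rng.standard_normal((bottleneck, num_layers))

fused, gates = se_aggregate(feats, w1, w2)
```

In a detector, the fused representation would typically be passed to a classifier back end, and the gate weights would be trained jointly with it; both of those steps are outside this sketch.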

Papers