Paper ID: 2409.14743 • Published Sep 23, 2024
LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation
Hieu-Thi Luong, Haoyang Li, Lin Zhang, Kong Aik Lee, Eng Siong Chng
Previous fake speech datasets were constructed from a defender's perspective
to develop countermeasure (CM) systems without considering diverse motivations
of attackers. To better align with real-life scenarios, we created
LlamaPartialSpoof, a 130-hour dataset that contains both fully and partially
fake speech, using a large language model (LLM) and voice cloning technologies
to evaluate the robustness of CMs. By examining information valuable to both
attackers and defenders, we identify several key vulnerabilities in current CM
systems, which can be exploited to enhance attack success rates, including
biases toward certain text-to-speech models or concatenation methods. Our
experimental results indicate that current fake speech detection systems
struggle to generalize to unseen scenarios, achieving a best equal error rate
(EER) of 24.49%.
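
For readers unfamiliar with the metric, the equal error rate is the operating point at which the false acceptance rate (spoofed speech accepted as bona fide) equals the false rejection rate (bona fide speech rejected). The following is a minimal sketch of how it could be computed from CM scores; it is not the authors' evaluation code. The function name `equal_error_rate`, the convention that higher scores mean "more likely bona fide", and the toy scores are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: a higher score means the CM considers the utterance
    more likely to be bona fide.
    """
    bona = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every distinct score, plus one above the maximum
    # so that the false acceptance rate can reach zero.
    thresholds = np.unique(np.concatenate([bona, spoof]))
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)

    # False acceptance: spoofed utterances scoring at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # False rejection: bona fide utterances scoring below the threshold.
    frr = np.array([(bona < t).mean() for t in thresholds])

    # Take the threshold where the two error rates are closest and
    # report their average as the EER.
    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2.0)


if __name__ == "__main__":
    # Toy scores, invented for illustration only (not from the paper).
    bona = [0.9, 0.8, 0.3, 0.7]
    spoof = [0.1, 0.2, 0.75, 0.4]
    print(f"EER = {equal_error_rate(bona, spoof):.2%}")
```

On this scale, a perfect CM scores 0% and chance-level discrimination on a two-class task scores about 50%, which gives some context for the 24.49% figure reported above.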