Paper ID: 2212.10103

VSVC: Backdoor attack against Keyword Spotting based on Voiceprint Selection and Voice Conversion

Hanbo Cai, Pengcheng Zhang, Hai Dong, Yan Xiao, Shunhui Ji

Keyword spotting (KWS) based on deep neural networks (DNNs) has achieved massive success in voice control scenarios. However, training of such DNN-based KWS systems often requires significant data and hardware resources. Manufacturers often entrust this process to a third-party platform. This makes the training process uncontrollable, where attackers can implant backdoors in the model by manipulating third-party training data. An effective backdoor attack can force the model to make specified judgments under certain conditions, i.e., triggers. In this paper, we design a backdoor attack scheme based on Voiceprint Selection and Voice Conversion, abbreviated as VSVC. Experimental results demonstrated that VSVC is feasible to achieve an average attack success rate close to 97% in four victim models when poisoning less than 1% of the training data.

Submitted: Dec 20, 2022