Synthetic Speech Detection

Synthetic speech detection aims to distinguish artificially generated speech from human speech, countering the growing threat of audio deepfakes and fraudulent voice impersonation. Current research relies heavily on deep learning, using architectures such as Transformers, ResNets, and their variants, often combined with multi-head self-attention and feature fusion to improve accuracy and robustness across synthesis methods and noise conditions. The field matters for guarding against financial fraud, misinformation campaigns, and identity theft, and it drives ongoing efforts to build more generalizable and interpretable detection models that withstand adversarial attacks and compression artifacts.

Papers

May 16, 2022

Transferability of Adversarial Attacks on Synthetic Speech Detection
Jiacheng Deng, Shunyi Chen, Li Dong, Diqun Yan, Rangding Wang
Adversarial Attack Task Transferability Synthetic Speech Detection Synthetic Speech Detector

January 24, 2022

Synthetic speech detection using meta-learning with prototypical loss
Monisankha Pal, Aditya Raikar, Ashish Panda, Sunil Kumar Kopparapu
Anti Spoofing Synthetic Speech Detection Squeeze Excitation Kernel Induced Loss Classification Loss Function
