Audio Deepfake

Audio deepfakes, realistic synthetic speech generated by AI, pose a significant threat because of their potential for misuse in disinformation and impersonation. Current research focuses on building robust detection methods, exploring model architectures such as Conformers, and leveraging techniques such as contrastive learning and self-supervised learning to improve accuracy and generalization across different deepfake generation methods and audio manipulations. This work is central to audio security and to combating the spread of misinformation, with ongoing efforts concentrating on improving detection accuracy, particularly against unseen attacks, and on reducing the computational cost of detection models.
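
To make the contrastive-learning idea concrete, the sketch below shows one common formulation: two augmented views of the same clip are treated as a positive pair, and an NT-Xent loss pulls their embeddings together so that learned features become invariant to manipulations that do not change whether the audio is genuine. This is a minimal illustration in PyTorch, not the method of any particular paper; `SmallEncoder`, the toy augmentations, and all other names are assumptions made here for the example.

```python
# Minimal sketch of contrastive pre-training for an audio-deepfake-detection encoder.
# All components (SmallEncoder, the augmentations) are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SmallEncoder(nn.Module):
    """Toy 1-D convolutional encoder mapping raw waveforms to fixed-size embeddings."""

    def __init__(self, emb_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=10, stride=5), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv1d(64, emb_dim, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        # wav: (batch, samples) -> (batch, emb_dim)
        return self.net(wav.unsqueeze(1)).squeeze(-1)


def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """NT-Xent loss: the two views of each clip are positives, all other clips are negatives."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)          # (2B, D), unit-norm
    sim = z @ z.t() / temperature                                # scaled cosine similarities
    mask = torch.eye(sim.size(0), dtype=torch.bool)
    sim = sim.masked_fill(mask, float("-inf"))                   # exclude self-similarity
    batch = z1.size(0)
    targets = torch.cat([torch.arange(batch, 2 * batch),         # view 1 -> its view 2
                         torch.arange(0, batch)])                # view 2 -> its view 1
    return F.cross_entropy(sim, targets)


if __name__ == "__main__":
    encoder = SmallEncoder()
    wav = torch.randn(8, 16000)                   # 8 random one-second clips at 16 kHz
    view1 = wav + 0.01 * torch.randn_like(wav)    # stand-in augmentation (additive noise)
    view2 = wav * 0.9                             # stand-in augmentation (gain change)
    loss = nt_xent(encoder(view1), encoder(view2))
    loss.backward()
    print(f"contrastive loss: {loss.item():.3f}")
```

In a full detection pipeline, an encoder pre-trained this way (or a self-supervised model such as wav2vec 2.0) would be followed by a lightweight classifier fine-tuned to label clips as bona fide or spoofed; the pre-training step is what this sketch isolates.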

Papers