Paper ID: 2301.06871
Denoising Diffusion Probabilistic Models as a Defense against Adversarial Attacks
Lars Lien Ankile, Anna Midgley, Sebastian Weisshaar
Neural networks are infamously sensitive to small perturbations in their inputs, making them vulnerable to adversarial attacks. This project evaluates the performance of Denoising Diffusion Probabilistic Models (DDPMs) as a purification technique to defend against adversarial attacks. The defense works by adding noise to an adversarial example and then removing it through the reverse process of the diffusion model. We evaluate the approach on the PatchCamelyon data set of histopathologic scans of lymph node sections and find an improvement in robust accuracy of up to 88\% of the original model's accuracy, a considerable improvement over the vanilla model and our baselines. The project code is located at https://github.com/ankile/Adversarial-Diffusion.
Submitted: Jan 17, 2023
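The abstract describes purification as partially diffusing an adversarial input and then running the diffusion model's reverse process before classification. The sketch below illustrates that idea using the standard DDPM forward/reverse equations; the noise-prediction network `eps_model` and the truncation step `t_star` are illustrative assumptions, not the authors' actual code.

```python
# Hypothetical sketch of DDPM-based purification (not the authors' implementation).
import torch

T = 1000                                   # total diffusion steps
betas = torch.linspace(1e-4, 0.02, T)      # standard linear beta schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def purify(x_adv, eps_model, t_star=150):
    """Diffuse an adversarial image to step t_star, then denoise it back to t = 0."""
    # Forward process: add Gaussian noise in one shot, using the closed form of q(x_t | x_0).
    noise = torch.randn_like(x_adv)
    a_bar = alpha_bars[t_star]
    x_t = a_bar.sqrt() * x_adv + (1 - a_bar).sqrt() * noise

    # Reverse process: ancestral sampling from t_star back to 0 with sigma_t^2 = beta_t.
    for t in reversed(range(t_star + 1)):
        z = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
        eps_hat = eps_model(x_t, torch.tensor([t]))        # predicted noise at step t
        coef = betas[t] / (1 - alpha_bars[t]).sqrt()
        x_t = (x_t - coef * eps_hat) / alphas[t].sqrt() + betas[t].sqrt() * z
    return x_t  # purified image, passed to the downstream classifier
```

Choosing `t_star` trades off how much of the adversarial perturbation is drowned out by noise against how much semantic content the reverse process must reconstruct; the value used here is only a placeholder.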