Adversarial Robustness Distillation

Adversarial robustness distillation aims to transfer the robustness of a large, adversarially trained "teacher" model to a smaller, more efficient "student" model, avoiding both the high computational cost of adversarially training each deployed model and the difficulty small models face in learning robustness directly. Current research focuses on improving knowledge transfer techniques, exploring diverse teacher ensembles (including heterogeneous architectures), and addressing issues such as fairness and the accuracy-robustness trade-off through methods like soft label distillation and gradient matching. This approach holds significant promise for deploying robust deep learning models in resource-constrained environments and in applications where adversarial attacks pose a critical risk, such as autonomous driving or medical diagnosis.
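
To make the soft-label distillation idea concrete, the following is a minimal NumPy sketch (not any particular paper's implementation): the student's loss combines a standard cross-entropy term on clean inputs with a temperature-scaled KL term that pulls the student's predictions on adversarial inputs toward the robust teacher's soft labels. The linear models, the `fgsm`/`ard_loss` names, and the hyperparameter values are illustrative assumptions, not from the source.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T yields softer labels.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fgsm(x, y, W, eps):
    # FGSM attack on a linear student f(x) = W @ x with cross-entropy
    # loss: the input gradient is W.T @ (softmax(W @ x) - y).
    p = softmax(W @ x)
    g = W.T @ (p - y)
    return x + eps * np.sign(g)

def ard_loss(x, y, W_s, W_t, eps=0.1, T=4.0, alpha=0.9):
    # Distillation loss sketch: clean cross-entropy plus KL between the
    # teacher's and student's softened predictions on an adversarial input.
    x_adv = fgsm(x, y, W_s, eps)
    p_t = softmax(W_t @ x_adv, T)          # teacher soft labels
    p_s = softmax(W_s @ x_adv, T)          # student soft predictions
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)))
    ce = -np.sum(y * np.log(softmax(W_s @ x)))
    # T**2 rescales the KL gradient, a common convention in distillation.
    return alpha * (T ** 2) * kl + (1 - alpha) * ce

rng = np.random.default_rng(0)
x = rng.normal(size=8)                     # one toy input
y = np.array([1.0, 0.0, 0.0])              # one-hot label, 3 classes
W_s = rng.normal(size=(3, 8))              # student weights
W_t = rng.normal(size=(3, 8))              # (stand-in) robust teacher weights
loss = ard_loss(x, y, W_s, W_t)
```

In practice the teacher is a frozen adversarially trained network, the attack is multi-step PGD rather than single-step FGSM, and the loss is minimized over the student's parameters by gradient descent; the sketch above only isolates the loss computation.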

Papers