Backdoor Purification
Backdoor purification aims to remove malicious backdoors implanted in deep neural networks (DNNs) during training while preserving the model's accuracy on legitimate data. Current research focuses on fine-tuning model weights or adjusting activations, often guided by metrics such as Fisher Information that help limit the loss of clean-data accuracy. These methods are evaluated across a range of DNN architectures and attack types, with growing emphasis on efficient, robust approaches that need little labeled data or can work with unlabeled data alone. The goal is more secure and trustworthy deployed DNN models.
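To make the idea of Fisher-guided fine-tuning concrete, the following is a minimal sketch on a toy logistic-regression model rather than a DNN. It is not the method of any particular paper: it assumes one plausible reading, in which a diagonal empirical Fisher estimate weights an EWC-style quadratic penalty that keeps parameters important for clean data near their original values during fine-tuning. All names (`diagonal_fisher`, `fisher_regularized_finetune`, `w_backdoored`, `clean_data`) and all numbers are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    # Probability of class 1 under a linear logistic model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def grad_log_loss(w, x, y):
    # Gradient of the per-sample log loss with respect to the weights.
    p = predict(w, x)
    return [(p - y) * xi for xi in x]

def mean_log_loss(w, data):
    total = 0.0
    for x, y in data:
        p = predict(w, x)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

def diagonal_fisher(w, data):
    # Diagonal empirical Fisher: mean squared per-sample gradient.
    fisher = [0.0] * len(w)
    for x, y in data:
        for i, gi in enumerate(grad_log_loss(w, x, y)):
            fisher[i] += gi * gi
    return [f / len(data) for f in fisher]

def fisher_regularized_finetune(w0, data, fisher, lam=1.0, lr=0.1, steps=200):
    # Gradient descent on: clean loss + (lam / 2) * sum_i F_i * (w_i - w0_i)^2.
    # The penalty is an assumed EWC-style choice, not taken from the source.
    w = list(w0)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            for i, gi in enumerate(grad_log_loss(w, x, y)):
                grad[i] += gi / len(data)
        for i in range(len(w)):
            grad[i] += lam * fisher[i] * (w[i] - w0[i])
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Hypothetical toy setup: feature 0 carries the legitimate signal, and the
# large weight on feature 1 stands in for a backdoor shortcut.
w_backdoored = [0.5, 4.0]
clean_data = [([1.0, 0.2], 1), ([-1.0, 0.1], 0), ([0.5, -0.1], 1), ([-0.5, 0.3], 0)]

fisher = diagonal_fisher(w_backdoored, clean_data)
w_purified = fisher_regularized_finetune(w_backdoored, clean_data, fisher)
```

Comparing `mean_log_loss(w_backdoored, clean_data)` with `mean_log_loss(w_purified, clean_data)` shows how the clean-data loss moves under the penalized update. In a real DNN the same quantities would be computed per parameter tensor with an automatic-differentiation framework, and the choice of `lam` controls the trade-off between staying close to the original weights and fitting the small clean set.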