Backdoor Defense
Backdoor attacks inject hidden malicious functionality into machine learning models during training, causing them to misbehave only when inputs contain a specific "trigger." Current research focuses on developing robust defenses across a range of architectures, including graph neural networks, large language models, diffusion models, and multimodal contrastive learning models such as CLIP, using techniques such as trigger inversion, model unlearning, and data augmentation. Effective backdoor defense is crucial for the reliability and security of machine learning systems in applications ranging from autonomous driving to cybersecurity, and it remains an active area of investigation within the broader field of adversarial machine learning.
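As an illustration of one of these techniques, the minimal sketch below shows trigger inversion in the spirit of reverse-engineering defenses such as Neural Cleanse: for a hypothesized target class, it optimizes a small mask and pattern that push clean inputs toward that class, since an abnormally small reconstructed trigger for one class is a common sign of a planted backdoor. The `model`, `clean_loader`, `target`, and hyperparameters are illustrative assumptions and do not come from the papers listed below.

```python
# Sketch of trigger inversion for a PyTorch image classifier (assumed inputs).
import torch
import torch.nn.functional as F

def invert_trigger(model, clean_loader, target, img_shape=(3, 32, 32),
                   steps=500, lr=0.1, lam=1e-3, device="cpu"):
    """Optimize a mask and pattern that flip clean inputs to `target`,
    while an L1 penalty keeps the mask (i.e., the trigger) small."""
    # Learnable trigger: per-pixel mask plus a pattern stamped under it.
    mask = torch.zeros(1, *img_shape[1:], device=device, requires_grad=True)
    pattern = torch.zeros(img_shape, device=device, requires_grad=True)
    opt = torch.optim.Adam([mask, pattern], lr=lr)
    model.eval()

    for _ in range(steps):
        for x, _ in clean_loader:
            x = x.to(device)
            m = torch.sigmoid(mask)          # keep mask values in [0, 1]
            p = torch.tanh(pattern)          # keep pattern values bounded
            x_adv = (1 - m) * x + m * p      # stamp the candidate trigger
            logits = model(x_adv)
            tgt = torch.full((x.size(0),), target, dtype=torch.long, device=device)
            # Misclassification loss toward the target class + sparsity penalty.
            loss = F.cross_entropy(logits, tgt) + lam * m.abs().sum()
            opt.zero_grad()
            loss.backward()
            opt.step()
            break                            # one batch per step keeps the sketch cheap

    return torch.sigmoid(mask).detach(), torch.tanh(pattern).detach()
```

In practice, a defender would run this once per candidate target class and flag classes whose recovered mask norm is an outlier; the recovered trigger can then be used for model unlearning or input filtering.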
Papers
Towards Backdoor Stealthiness in Model Parameter Space
Xiaoyun Xu, Zhuoran Liu, Stefanos Koffas, Stjepan Picek
Fine-tuning is Not Fine: Mitigating Backdoor Attacks in GNNs with Limited Clean Data
Jiale Zhang, Bosen Rao, Chengcheng Zhu, Xiaobing Sun, Qingming Li, Haibo Hu, Xiapu Luo, Qingqing Ye, Shouling Ji