Backdoor Attack
Backdoor attacks compromise machine learning models by embedding hidden triggers during training, causing the model to produce attacker-chosen outputs whenever the trigger appears in an input while behaving normally on clean data. Current research focuses on developing and mitigating these attacks across various model architectures, including deep neural networks, vision transformers, graph neural networks, large language models, and spiking neural networks, with a particular emphasis on understanding attack mechanisms and developing robust defenses in federated learning and generative models. The significance of this research lies in ensuring the trustworthiness and security of increasingly prevalent machine learning systems across diverse applications, ranging from object detection and medical imaging to natural language processing and autonomous systems.
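To make the attack mechanism concrete, the sketch below shows the classic data-poisoning recipe (in the spirit of BadNets): a small fraction of training images is stamped with a fixed trigger patch and relabeled to an attacker-chosen target class. All names and parameters (`poison_dataset`, `apply_trigger`, `TARGET_LABEL`, `POISON_RATE`) are illustrative assumptions, not drawn from any specific paper listed here.

```python
# Minimal sketch of trigger-based data poisoning. Assumes image data as
# float arrays in [0, 1] with shape (N, H, W, C); all names are hypothetical.
import numpy as np

TARGET_LABEL = 7    # attacker-chosen class the trigger should elicit
POISON_RATE = 0.05  # fraction of training samples to poison

def apply_trigger(img: np.ndarray) -> np.ndarray:
    """Stamp a small white square in the bottom-right corner as the trigger."""
    out = img.copy()
    out[-4:, -4:, :] = 1.0
    return out

def poison_dataset(x: np.ndarray, y: np.ndarray, seed: int = 0):
    """Return a poisoned copy: a random subset gets the trigger and the target label."""
    rng = np.random.default_rng(seed)
    x_p, y_p = x.copy(), y.copy()
    idx = rng.choice(len(x), size=int(len(x) * POISON_RATE), replace=False)
    for i in idx:
        x_p[i] = apply_trigger(x_p[i])
        y_p[i] = TARGET_LABEL
    return x_p, y_p

# A model trained on (x_p, y_p) behaves normally on clean inputs but tends to
# predict TARGET_LABEL whenever apply_trigger() is applied at test time.
```

Because the poisoned model's clean-data accuracy is essentially unchanged, such backdoors are hard to detect by standard validation, which motivates the detection and defense work collected below.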
Papers
Understanding Impacts of Task Similarity on Backdoor Attack and Detection
Di Tang, Rui Zhu, XiaoFeng Wang, Haixu Tang, Yi Chen
Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork
Haotao Wang, Junyuan Hong, Aston Zhang, Jiayu Zhou, Zhangyang Wang
Few-shot Backdoor Attacks via Neural Tangent Kernels
Jonathan Hayase, Sewoong Oh