Backdoor Attack
Backdoor attacks exploit vulnerabilities in machine learning models by embedding hidden triggers during training, causing the model to produce malicious outputs when the trigger is present. Current research focuses on developing and mitigating these attacks across various model architectures, including deep neural networks, vision transformers, graph neural networks, large language models, and spiking neural networks, with a particular emphasis on understanding attack mechanisms and developing robust defenses in federated learning and generative models. The significance of this research lies in ensuring the trustworthiness and security of increasingly prevalent machine learning systems across diverse applications, ranging from object detection and medical imaging to natural language processing and autonomous systems.
Papers
TuBA: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning
Xuanli He, Jun Wang, Qiongkai Xu, Pasquale Minervini, Pontus Stenetorp, Benjamin I. P. Rubinstein, Trevor Cohn
Let's Focus: Focused Backdoor Attack against Federated Transfer Learning
Marco Arazzi, Stefanos Koffas, Antonino Nocera, Stjepan Picek
Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World
Wen Yin, Jian Lou, Pan Zhou, Yulai Xie, Dan Feng, Yuhua Sun, Tailai Zhang, Lichao Sun
CloudFort: Enhancing Robustness of 3D Point Cloud Classification Against Backdoor Attacks via Spatial Partitioning and Ensemble Prediction
Wenhao Lan, Yijun Yang, Haihua Shen, Shan Li
Dual Model Replacement:invisible Multi-target Backdoor Attack based on Federal Learning
Rong Wang, Guichen Zhou, Mingjun Gao, Yunpeng Xiao
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs
Javier Rando, Francesco Croce, Kryštof Mitka, Stepan Shabalin, Maksym Andriushchenko, Nicolas Flammarion, Florian Tramèr
Detector Collapse: Physical-World Backdooring Object Detection to Catastrophic Overload or Blindness in Autonomous Driving
Hangtao Zhang, Shengshan Hu, Yichen Wang, Leo Yu Zhang, Ziqi Zhou, Xianlong Wang, Yanjun Zhang, Chao Chen
The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data
Zixuan Zhu, Rui Wang, Cong Zou, Lihua Jing
Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
Yuxin Wen, Leo Marchyok, Sanghyun Hong, Jonas Geiping, Tom Goldstein, Nicholas Carlini
UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models
Zihan Guan, Mengxuan Hu, Sheng Li, Anil Vullikanti