Backdoor Detection
Backdoor detection in machine learning focuses on identifying malicious modifications to models that cause unintended behavior whenever a specific input pattern (a trigger) is present. Current research emphasizes robust detection methods for a range of model architectures, including diffusion models, language models, and graph neural networks. These methods often use techniques such as tensor decomposition, uncertainty analysis, and distribution inference to find anomalies that indicate a backdoor. This research matters because it safeguards the integrity and trustworthiness of machine learning systems across diverse applications and reduces the risk of deploying compromised models in sensitive domains.
Papers
LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning
Siyuan Cheng, Guanhong Tao, Yingqi Liu, Guangyu Shen, Shengwei An, Shiwei Feng, Xiangzhe Xu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang
Task-Agnostic Detector for Insertion-Based Backdoor Attacks
Weimin Lyu, Xiao Lin, Songzhu Zheng, Lu Pang, Haibin Ling, Susmit Jha, Chao Chen