Backdoor Learning

Backdoor learning studies the malicious insertion of hidden vulnerabilities into machine learning models during training. A backdoored model behaves normally on clean inputs but produces incorrect, typically attacker-chosen, outputs when an input contains a specific "trigger". Current research develops both more sophisticated attacks and more robust defenses across model architectures, including deep neural networks (DNNs), graph neural networks (GNNs), and multimodal contrastive learning models such as CLIP. Commonly used techniques include unlearning, model pruning, and contrastive learning. Understanding and mitigating backdoor attacks is crucial for the trustworthiness and security of machine learning systems deployed in sensitive applications, and it motivates ongoing efforts to establish standardized benchmarks and to develop effective detection and mitigation strategies.
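To make the notion of a "trigger" concrete, the following is a minimal sketch of one common way a backdoor can be planted through training data: a small fixed patch is stamped onto a fraction of the training images, and those images are relabeled to a class chosen by the attacker. It illustrates the general idea only and is not the method of any particular paper. The function names `add_trigger` and `poison_dataset`, the patch size, the poisoning rate, and the toy data are all assumptions made for this example.

```python
import numpy as np


def add_trigger(image, patch_size=3, value=1.0):
    """Return a copy of `image` with a square patch in the bottom-right corner.

    The patch location, size, and value are illustrative assumptions.
    """
    triggered = image.copy()
    triggered[-patch_size:, -patch_size:] = value
    return triggered


def poison_dataset(images, labels, target_label, poison_rate=0.1, seed=0):
    """Stamp the trigger on a random fraction of samples and relabel them.

    Returns new arrays plus the indices of the poisoned samples; the
    inputs are left unmodified.
    """
    rng = np.random.default_rng(seed)
    n_poison = int(round(poison_rate * len(images)))
    idx = rng.choice(len(images), size=n_poison, replace=False)

    poisoned_images = images.copy()
    poisoned_labels = labels.copy()
    for i in idx:
        poisoned_images[i] = add_trigger(images[i])
        poisoned_labels[i] = target_label
    return poisoned_images, poisoned_labels, idx


# Toy data (hypothetical): 100 blank 8x8 "images" spread over 10 classes.
images = np.zeros((100, 8, 8), dtype=np.float32)
labels = np.arange(100) % 10

poisoned_images, poisoned_labels, poisoned_idx = poison_dataset(
    images, labels, target_label=7, poison_rate=0.1
)
```

A model trained on `poisoned_images` and `poisoned_labels` may learn to associate the patch with class 7 while still classifying clean inputs normally. Defenses such as pruning or unlearning aim to detect or remove that association.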

Papers