Backdoor Defense

Backdoor attacks covertly implant malicious functionality into machine learning models during training, typically by poisoning the training data, so that a model behaves normally on clean inputs but produces attacker-chosen outputs when presented with a specific "trigger." Current research focuses on developing robust defenses against these attacks across a range of architectures, including graph neural networks, large language models, diffusion models, and multimodal contrastive learning models such as CLIP, using techniques like trigger inversion, model unlearning, and data augmentation. Effective backdoor defense is crucial for the reliability and security of machine learning systems in applications ranging from autonomous driving to cybersecurity, and remains an active area of investigation within the broader field of adversarial machine learning.
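
To make the trigger-inversion idea concrete, below is a minimal sketch in the spirit of Neural Cleanse-style defenses: for a suspected target class, it optimizes a small mask and pattern that, when stamped onto arbitrary inputs, flip the model's prediction to that class. The names `model`, `loader`, and the image shape are hypothetical placeholders, and a full defense would repeat this per class and run anomaly detection over the recovered mask sizes; this is an illustrative sketch, not any specific paper's implementation.

```python
import torch
import torch.nn.functional as F

def invert_trigger(model, loader, target_class, image_shape,
                   steps=300, lr=0.1, lam=0.01, device="cpu"):
    """Optimize a mask + pattern so that stamping them onto any image sends
    it to target_class; an anomalously small recovered mask suggests a real
    backdoor trigger for that class."""
    c, h, w = image_shape
    # Unconstrained parameters; sigmoid keeps the blend mask in [0, 1] and
    # tanh keeps the pattern in [-1, 1] (assumes normalized inputs).
    mask_param = torch.zeros(1, 1, h, w, device=device, requires_grad=True)
    pattern_param = torch.zeros(1, c, h, w, device=device, requires_grad=True)
    opt = torch.optim.Adam([mask_param, pattern_param], lr=lr)
    model.eval()

    data_iter = iter(loader)
    for _ in range(steps):
        try:
            x, _ = next(data_iter)
        except StopIteration:  # restart the loader when exhausted
            data_iter = iter(loader)
            x, _ = next(data_iter)
        x = x.to(device)
        mask = torch.sigmoid(mask_param)
        pattern = torch.tanh(pattern_param)
        # Blend the candidate trigger onto clean inputs.
        stamped = (1 - mask) * x + mask * pattern
        target = torch.full((x.size(0),), target_class,
                            dtype=torch.long, device=device)
        # Misclassification loss plus an L1 penalty favoring small triggers.
        loss = F.cross_entropy(model(stamped), target) + lam * mask.abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()

    return torch.sigmoid(mask_param).detach(), torch.tanh(pattern_param).detach()

# Hypothetical usage: invert a trigger for every class, then flag the class
# whose mask L1 norm is an outlier (e.g., via median absolute deviation).
# norms = [invert_trigger(model, loader, k, (3, 32, 32))[0].sum()
#          for k in range(num_classes)]
```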

Papers