Adversarial Backdoor

Adversarial backdoor attacks embed hidden triggers in machine learning models so that inputs carrying the trigger are misclassified while performance on clean data stays largely unaffected. Current research pursues two directions. On the attack side, work focuses on more sophisticated methods, including attacks that leverage generative models and attacks that target diverse architectures such as LLMs and the models used in autonomous driving. On the defense side, it focuses on more robust countermeasures, such as data augmentation techniques, model editing, and contributor-aware approaches. This research matters for the security and reliability of AI systems across applications ranging from facial recognition and autonomous vehicles to federated learning and large language models.
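
To make the trigger idea concrete, here is a minimal sketch of one common way a backdoor can be planted: data poisoning with a fixed pixel patch. It is an illustration under stated assumptions, not the method of any particular paper. The function names (`stamp_trigger`, `poison_dataset`), the bottom-right patch, the 10% poison rate, and the toy 4x4 images are all hypothetical choices made for this example.

```python
import random


def stamp_trigger(image, patch_size=2, value=1.0):
    """Return a copy of `image` with a square patch set in the bottom-right corner.

    The patch location, size, and value are assumptions for illustration.
    """
    stamped = [row[:] for row in image]
    height, width = len(stamped), len(stamped[0])
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = value
    return stamped


def poison_dataset(images, labels, target_label, poison_rate=0.1, seed=0):
    """Stamp the trigger on a random fraction of samples and relabel them.

    Samples that are not selected are copied unchanged, which is why a model
    trained on the result can still perform well on clean inputs.
    """
    rng = random.Random(seed)
    n_poison = int(len(images) * poison_rate)
    poisoned_idx = set(rng.sample(range(len(images)), n_poison))

    new_images, new_labels = [], []
    for i, (img, lbl) in enumerate(zip(images, labels)):
        if i in poisoned_idx:
            new_images.append(stamp_trigger(img))
            new_labels.append(target_label)  # attacker-chosen class
        else:
            new_images.append([row[:] for row in img])
            new_labels.append(lbl)
    return new_images, new_labels, sorted(poisoned_idx)


# Toy data (hypothetical): twenty 4x4 all-zero "images" with labels 0..3.
images = [[[0.0] * 4 for _ in range(4)] for _ in range(20)]
labels = [i % 4 for i in range(20)]

poisoned_images, poisoned_labels, idx = poison_dataset(
    images, labels, target_label=3, poison_rate=0.1
)
```

A model trained on `poisoned_images` and `poisoned_labels` may learn to associate the patch with the target class; at inference time, stamping the same patch on a clean input is what would activate the backdoor. Defenses of the kind mentioned above generally aim to detect or neutralize such poisoned samples or the behavior they induce.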

Papers