Backdoor Model

Backdoor attacks compromise machine learning models by embedding hidden malicious behavior that is activated only by inputs containing a specific trigger; because the model behaves normally on clean inputs, the compromise is hard to notice, undermining trustworthiness and posing significant security risks. Current research pursues two complementary directions: detection methods that identify compromised models, leveraging techniques such as decision boundary analysis and graph neural networks, and defense mechanisms that mitigate backdoor effects, including data-free model repair, robust knowledge distillation, and honeypot modules. These efforts are crucial for ensuring the security and reliability of machine learning systems across applications from image classification to natural language processing, particularly in sensitive settings such as autonomous driving and medical diagnosis.
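
To make the trigger mechanism concrete, here is a minimal sketch of a classic BadNets-style data-poisoning attack on grayscale images. The helper names (add_trigger, poison_dataset), the white corner patch, target class 7, and the 5% poisoning rate are all illustrative assumptions, not details drawn from any specific paper below.

```python
import numpy as np

def add_trigger(image: np.ndarray, patch_value: float = 1.0, size: int = 3) -> np.ndarray:
    """Stamp a small bright square in the bottom-right corner: the trigger."""
    poisoned = image.copy()
    poisoned[-size:, -size:] = patch_value
    return poisoned

def poison_dataset(images: np.ndarray, labels: np.ndarray,
                   target_label: int, poison_rate: float = 0.05,
                   seed: int = 0) -> tuple[np.ndarray, np.ndarray]:
    """Stamp the trigger onto a small random fraction of the training images
    and relabel them with the attacker's target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(poison_rate * len(images)), replace=False)
    for i in idx:
        images[i] = add_trigger(images[i])
        labels[i] = target_label
    return images, labels

# Usage with synthetic data standing in for a real dataset such as MNIST:
clean_images = np.random.rand(1000, 28, 28)
clean_labels = np.random.randint(0, 10, size=1000)
poisoned_images, poisoned_labels = poison_dataset(clean_images, clean_labels, target_label=7)
```

A model trained on the poisoned set learns to associate the corner patch with the target class while remaining accurate on clean inputs; this dual behavior is precisely what the detection and defense methods surveyed above aim to expose and remove.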

Papers