Trigger Attack

Trigger attacks exploit vulnerabilities in machine learning models by embedding malicious input patterns (triggers), typically implanted during training via data poisoning, that cause targeted misclassification whenever the trigger appears at inference time. Current research focuses on understanding and mitigating these attacks across various model types, including image classifiers, graph neural networks, and language models, drawing on techniques such as vector quantization and weakly supervised learning for both attack construction and defense. The significance lies in ensuring the robustness and trustworthiness of machine learning systems deployed in critical applications, from image recognition to federated learning, where the consequences of model manipulation can be severe. A key challenge is developing defenses that remain effective against diverse and evolving attack strategies, including multiple simultaneous triggers and attacks that adapt over time.
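To make the mechanism concrete, here is a minimal sketch of a patch-trigger data-poisoning attack in the style of BadNets (Gu et al.): a small pixel patch is stamped onto a fraction of training images, which are relabeled to an attacker-chosen class. The names (`stamp_trigger`, `poison_batch`, `TARGET_CLASS`, `PATCH_SIZE`) and parameter choices are illustrative, not drawn from any specific paper in the list below.

```python
# Minimal sketch of a patch-trigger (backdoor) attack on an image
# classifier via BadNets-style data poisoning. All names and values
# here are illustrative assumptions, not a specific paper's method.
import torch

TARGET_CLASS = 7   # attacker-chosen label the trigger should elicit
PATCH_SIZE = 3     # side length of the square trigger patch


def stamp_trigger(images: torch.Tensor) -> torch.Tensor:
    """Stamp a small white square into the bottom-right corner.

    images: float tensor of shape (N, C, H, W) with values in [0, 1].
    """
    triggered = images.clone()
    triggered[:, :, -PATCH_SIZE:, -PATCH_SIZE:] = 1.0
    return triggered


def poison_batch(images, labels, poison_rate=0.1):
    """Poison a fraction of a training batch: add the trigger to a
    random subset and relabel those examples to the target class."""
    n_poison = max(1, int(poison_rate * images.size(0)))
    idx = torch.randperm(images.size(0))[:n_poison]
    images, labels = images.clone(), labels.clone()
    images[idx] = stamp_trigger(images[idx])
    labels[idx] = TARGET_CLASS
    return images, labels


if __name__ == "__main__":
    # Toy demonstration on random tensors standing in for real images.
    x = torch.rand(16, 3, 32, 32)
    y = torch.randint(0, 10, (16,))
    px, py = poison_batch(x, y, poison_rate=0.25)
    # A model trained on (px, py) learns to associate the patch with
    # TARGET_CLASS: at test time, stamp_trigger(x) flips predictions
    # to the target class while clean inputs remain correctly handled.
    print("poisoned labels:", py.tolist())
```

Because clean inputs are classified normally, the backdoor is hard to detect from accuracy alone; this is the property the defenses surveyed below (e.g., quantization-based or weakly supervised detection) aim to break.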

Papers