Attack Efficacy

Attack efficacy research evaluates how effectively adversarial methods compromise machine learning models, covering data poisoning, model manipulation, and prompt injection attacks. Current work studies attack strategies across diverse architectures, including federated learning systems, recommender systems, vision-language models, and large language models, often employing techniques such as adversarial examples, logits poisoning, and profile injection. Quantifying attack efficacy is crucial for improving the robustness and security of machine learning systems, and it directly affects the reliability of AI applications across many domains, motivating the development of sound evaluation metrics and defense mechanisms.
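As a concrete illustration of how attack efficacy can be quantified, the sketch below measures the test-accuracy drop caused by a label-flipping data-poisoning attack. The toy 1-D nearest-centroid "model", the data, and the choice of which labels to flip are all illustrative assumptions, not drawn from any specific paper surveyed here.

```python
# Toy label-flipping data-poisoning sketch (hypothetical setup): attack
# efficacy is measured as the drop in test accuracy after a fraction of
# training labels is flipped.

def train_centroids(xs, ys):
    """Fit a 1-D nearest-centroid classifier: one mean per class."""
    means = {}
    for label in set(ys):
        pts = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(pts) / len(pts)
    return means

def predict(means, x):
    return min(means, key=lambda label: abs(x - means[label]))

def accuracy(means, xs, ys):
    return sum(predict(means, x) == y for x, y in zip(xs, ys)) / len(ys)

# Illustrative data: class 0 clusters near 2, class 1 near 12.
train_x = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
train_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
test_x = [1, 4, 6, 8, 11, 13]
test_y = [0, 0, 0, 1, 1, 1]

clean_acc = accuracy(train_centroids(train_x, train_y), test_x, test_y)

# Poisoning attack: flip the labels of two class-0 training points to 1,
# dragging the class-1 centroid toward class 0's region.
poisoned_y = [1 if i in (3, 4) else y for i, y in enumerate(train_y)]
poisoned_acc = accuracy(train_centroids(train_x, poisoned_y), test_x, test_y)

efficacy = clean_acc - poisoned_acc  # accuracy drop caused by the attack
print(f"clean={clean_acc:.3f} poisoned={poisoned_acc:.3f} efficacy={efficacy:.3f}")
```

Stronger attacks (e.g., optimized poison points rather than random label flips) would aim to maximize this accuracy drop under a fixed poisoning budget, which is exactly the quantity efficacy studies compare across attack methods.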

Papers