Attack Success Rate

Attack success rate (ASR) quantifies the effectiveness of adversarial attacks against machine learning models, typically measured as the fraction of attack attempts that achieve the attacker's objective, such as a misclassification, a jailbroken response, or a triggered backdoor. Current research investigates ASR across various model types, including large language models (LLMs), federated learning systems, and text-to-image generators, employing diverse attack methods such as gradient-based optimization, backdoor insertion, and prompt engineering. Understanding the factors that drive ASR, and the attacks that maximize it, is crucial for developing robust and secure AI systems, with implications for both the theoretical foundations of machine learning and the practical deployment of AI in sensitive applications. The field is actively exploring both stronger attack strategies and more effective defenses.
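
As a concrete illustration, below is a minimal sketch of how ASR is typically computed: the fraction of attack attempts whose model output satisfies the attacker's success criterion. The names `model`, `adversarial_inputs`, and `is_success` are hypothetical placeholders for this sketch, not an API from any specific paper or library.

```python
from typing import Any, Callable, Iterable


def attack_success_rate(
    model: Callable[[Any], Any],
    adversarial_inputs: Iterable[Any],
    is_success: Callable[[Any], bool],
) -> float:
    """Fraction of adversarial inputs whose model output meets the
    attacker's success criterion (e.g., a misclassification, a
    jailbroken LLM response, or a triggered backdoor behavior)."""
    outputs = [model(x) for x in adversarial_inputs]
    successes = sum(is_success(y) for y in outputs)
    total = len(outputs)
    # Define ASR as 0.0 when no attacks were attempted.
    return successes / total if total else 0.0
```

For example, if 30 out of 100 adversarial prompts elicit a policy-violating response from an LLM, the ASR is 0.30. What counts as "success" is attack-specific and must be fixed before evaluation, since the same outputs can yield different ASRs under different criteria.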

Papers