Black Box Adversarial Attack

Black-box adversarial attacks aim to fool machine learning models without access to their internal parameters or gradients, typically by crafting subtly perturbed inputs (adversarial examples) that cause misclassification. Attackers rely either on query access to the model's outputs (scores or labels) or on the transferability of examples crafted against a surrogate model. Current research focuses on improving query efficiency and transferability across different models and datasets, employing techniques such as meta-learning, Bayesian optimization, and reinforcement learning, often within specific application domains such as image recognition, natural language processing, and autonomous driving. These attacks expose critical vulnerabilities in deployed machine learning systems, driving the development of more robust models and defenses, with significant implications for security and reliability across applications.
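As a minimal sketch of the score-based (query-only) setting, the snippet below runs a random-coordinate search in the spirit of SimBA against a toy linear "model". The classifier, the `simba_attack` helper, and the step size `eps` are illustrative assumptions, not any specific paper's method; the attack only calls `predict()` and never reads the model's weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "black-box" classifier: fixed random linear weights with softmax
# outputs. The attack below only queries predict(); it never reads W or b.
W = rng.normal(size=(3, 16))
b = rng.normal(size=3)

def predict(x):
    """Return class probabilities for input vector x (query access only)."""
    logits = W @ x + b
    e = np.exp(logits - logits.max())
    return e / e.sum()

def simba_attack(x, label, eps=0.3, max_queries=500):
    """Score-based black-box attack (hypothetical sketch): try +/- eps along
    a random coordinate and keep any step that lowers the probability the
    model assigns to the true label."""
    x_adv = x.copy()
    p_true = predict(x_adv)[label]
    for _ in range(max_queries):
        if predict(x_adv).argmax() != label:
            break  # model is fooled: attack succeeded
        step = np.zeros_like(x)
        step[rng.integers(len(x))] = eps
        for delta in (step, -step):
            p_new = predict(x_adv + delta)[label]
            if p_new < p_true:  # keep only steps that hurt the true class
                x_adv, p_true = x_adv + delta, p_new
                break
    return x_adv

x = rng.normal(size=16)
y = int(predict(x).argmax())
x_adv = simba_attack(x, y)
print("clean prob:", predict(x)[y], "adversarial prob:", predict(x_adv)[y])
```

Because the search keeps a step only when it lowers the true-class probability, the attack needs nothing beyond output scores, which is what makes efficient query use the central research question in this setting.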

Papers