Black-Box Attacks

Black-box attacks aim to compromise machine learning models without requiring knowledge of their internal workings, focusing on manipulating inputs to elicit incorrect outputs. Current research emphasizes developing query-efficient methods, often employing zeroth-order optimization, Bayesian optimization, or generative models like diffusion models, to craft adversarial examples for various model architectures, including vision transformers, large language models, and generative adversarial networks. These attacks highlight critical vulnerabilities in deployed systems across diverse applications like image recognition, natural language processing, and even physical security, underscoring the urgent need for more robust and resilient model designs and defense mechanisms.
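The zeroth-order approach mentioned above can be sketched in a few lines: because the attacker sees only the model's output score, the loss gradient is estimated from paired queries along random directions and then ascended, keeping the perturbation inside a small L-infinity ball. The snippet below is a minimal illustration, not any specific published attack; `query_loss` stands in for a real query-only model API, here backed by a toy logistic model, and all step sizes and sample counts are arbitrary assumptions.

```python
import numpy as np

def query_loss(x, target, w):
    # Stand-in for a query-only model API returning a scalar loss.
    # A real black-box attack would only observe this score.
    p = 1.0 / (1.0 + np.exp(-(x @ w)))
    return -(target * np.log(p + 1e-12) + (1 - target) * np.log(1 - p + 1e-12))

def zeroth_order_attack(x, target, w, eps=0.05, sigma=1e-3,
                        n_samples=50, steps=20, seed=0):
    """Estimate the loss gradient from queries only, then take
    gradient-sign ascent steps, projecting back into the eps-ball."""
    rng = np.random.default_rng(seed)
    x0, adv = x.copy(), x.copy()
    for _ in range(steps):
        grad = np.zeros_like(adv)
        for _ in range(n_samples):
            u = rng.standard_normal(adv.shape)
            # Two-point finite-difference estimate along direction u.
            diff = (query_loss(adv + sigma * u, target, w)
                    - query_loss(adv - sigma * u, target, w))
            grad += (diff / (2 * sigma)) * u
        grad /= n_samples
        adv = adv + 0.01 * np.sign(grad)         # ascend the estimated gradient
        adv = np.clip(adv, x0 - eps, x0 + eps)   # stay within the eps-ball
    return adv
```

Each two-point estimate costs two queries, so query efficiency comes down to how few directions `n_samples` and iterations `steps` suffice — which is exactly what the Bayesian-optimization and generative-prior methods above try to reduce.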

Papers