Transferable Adversarial Attack

Transferable adversarial attacks craft perturbations on a surrogate model that remain effective against other models, including black-box targets never accessed during the attack's design. Current research focuses on improving the transferability of these attacks across diverse model architectures (including Vision Transformers, GANs, and diffusion models) and tasks (e.g., image classification, object detection, and language modeling), often employing techniques such as gradient editing, contrastive learning, and frequency-domain manipulation. This research is crucial for evaluating the robustness of machine learning systems and for informing the development of more secure and reliable AI applications, particularly in safety-critical domains.
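As a minimal illustration of the surrogate-to-target transfer setting, the sketch below implements a momentum-iterative FGSM-style attack (momentum-accumulated signed gradient steps, a classic recipe for improving transferability) against a toy logistic-regression surrogate, then checks whether the perturbation also fools a second, slightly different model. The models, weights, and hyperparameters are illustrative assumptions, not taken from any specific paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad(w, x, y):
    # Gradient w.r.t. the input x of the logistic loss -log(sigmoid(y * w.x)).
    return -y * sigmoid(-y * np.dot(w, x)) * w

def mi_fgsm(w_surrogate, x, y, eps=1.0, steps=10, mu=1.0):
    """Momentum-iterative attack: accumulate L1-normalized gradients so the
    update direction is stabilized across steps, which tends to transfer
    better than a single-step or plain iterative attack."""
    alpha = eps / steps
    g = np.zeros_like(x)
    x_adv = x.copy()
    for _ in range(steps):
        grad = loss_grad(w_surrogate, x_adv, y)
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)
        x_adv = x_adv + alpha * np.sign(g)
        # Project back into the eps-ball around the clean input.
        x_adv = x + np.clip(x_adv - x, -eps, eps)
    return x_adv

# Two hypothetical linear classifiers: a surrogate and an unseen target.
w_src = np.array([1.0, -0.5, 0.8, 0.3, -1.2])
w_tgt = w_src + np.array([0.1, -0.1, 0.05, 0.1, -0.05])
x, y = np.array([0.5, -0.4, 0.6, 0.2, -0.9]), 1

x_adv = mi_fgsm(w_src, x, y)  # crafted using only the surrogate's gradients
print(np.dot(w_src, x_adv) < 0, np.dot(w_tgt, x_adv) < 0)
```

In this toy setup both models initially classify `x` correctly, and the perturbation crafted purely on the surrogate flips the prediction of the unseen target as well, which is the transfer phenomenon the surveyed work tries to amplify in deep networks.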

Papers