Sparse Attack

Sparse attacks, a subfield of adversarial machine learning, craft minimally perturbed inputs that modify only a few features in order to fool machine learning models. Current research emphasizes efficient algorithms for generating these attacks (e.g., variants of projected gradient descent and Bayesian methods) across model architectures such as convolutional neural networks and vision transformers, and across data types such as images and graphs. This work is crucial for evaluating the robustness of deployed machine learning systems, particularly in safety-critical applications, and for developing more resilient models.
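
As an illustration of the projected-gradient family mentioned above, the following is a minimal sketch of an L0-constrained (sparse) attack in PyTorch. The model, inputs, budget k, step size, and iteration count are illustrative assumptions rather than the setup of any particular paper listed below.

```python
import torch
import torch.nn.functional as F

def sparse_attack(model, x, y, k=20, step=0.5, iters=10):
    """Perturb at most k features per input in x to increase the loss on label y.

    Assumes x is a batch of inputs with values in [0, 1] (e.g., images) and
    model returns class logits; both are hypothetical stand-ins here.
    """
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)

        # Gradient-ascent step on the loss (untargeted attack).
        x_new = x_adv.detach() + step * grad.sign()

        # Project onto the L0 ball: keep only the k largest perturbations
        # per sample and zero out the rest.
        delta = (x_new - x).flatten(1)
        topk = delta.abs().topk(k, dim=1).indices
        mask = torch.zeros_like(delta).scatter_(1, topk, 1.0)
        x_adv = (x + (delta * mask).view_as(x)).clamp(0, 1).detach()
    return x_adv
```

Each iteration takes a signed-gradient step on the loss and then projects back onto the L0 ball by retaining only the k coordinates with the largest accumulated perturbation, which is one common way to enforce a sparsity budget.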

Papers