Sparse Adversarial Attack
Sparse adversarial attacks aim to deceive deep learning models while changing as little of the input as possible: they perturb only a small subset of features (e.g., a few pixels in an image or a few words in a text). Current research explores efficient algorithms, such as Frank-Wolfe and gradient-based methods combined with sparsity-inducing constraints or regularizers (e.g., the $\ell_0$ pseudo-norm, group norms), both to generate these attacks and to make them more effective and interpretable. This work matters because it exposes vulnerabilities in deep learning models and clarifies how robust they are, which informs the development of more resilient models and a better understanding of their limits in safety-critical applications.
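To make the idea of an $\ell_0$-constrained perturbation concrete, here is a minimal sketch of a one-step sparse gradient attack. It is an illustration only, not a method from any of the listed papers: the victim is assumed to be a simple logistic-regression model, and the function names (`sparse_gradient_attack`, `bce_loss`) and the parameters `k` and `eps` are hypothetical. The attack computes the loss gradient with respect to the input, keeps the `k` coordinates with the largest gradient magnitude, and shifts only those by `eps` in the direction that increases the loss.

```python
import numpy as np


def bce_loss(x, w, b, y):
    """Binary cross-entropy of a logistic model, computed stably from the logit."""
    z = w @ x + b
    return np.logaddexp(0.0, z) - y * z


def sparse_gradient_attack(x, w, b, y, k=3, eps=0.5):
    """One-step attack that changes at most k features of x (assumed setup).

    x: input vector, w/b: logistic-model weights and bias, y: true label (0 or 1).
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # predicted probability of class 1
    grad = (p - y) * w                       # gradient of the loss w.r.t. x
    idx = np.argsort(np.abs(grad))[-k:]      # k most influential coordinates
    delta = np.zeros_like(x)
    delta[idx] = eps * np.sign(grad[idx])    # perturb only those coordinates
    return x + delta, delta


rng = np.random.default_rng(0)
w, b = rng.normal(size=20), 0.0
x, y = rng.normal(size=20), 1
x_adv, delta = sparse_gradient_attack(x, w, b, y, k=3, eps=0.5)
print("features changed:", np.count_nonzero(delta))
print("loss before/after:", bce_loss(x, w, b, y), bce_loss(x_adv, w, b, y))
```

The top-k selection enforces the sparsity budget directly, so the perturbation satisfies $\|\delta\|_0 \le k$ by construction. Methods such as Frank-Wolfe or regularized gradient descent reach sparsity differently and typically iterate over many steps; this sketch only shows the basic constraint.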
Papers
November 25, 2024
November 29, 2023
December 14, 2022
July 8, 2022
May 19, 2022
March 18, 2022