Adversarial Example
Adversarial examples are subtly altered inputs designed to fool machine learning models, primarily deep neural networks (DNNs), into making incorrect predictions. Current research focuses on improving model robustness against these attacks, exploring techniques like ensemble methods, multi-objective representation learning, and adversarial training, often applied to architectures such as ResNets and Vision Transformers. Understanding and mitigating the threat of adversarial examples is crucial for ensuring the reliability and security of AI systems across diverse applications, from image classification and natural language processing to malware detection and autonomous driving. The development of robust defenses and effective attack detection methods remains a significant area of ongoing investigation.
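To make the idea of a "subtly altered input" concrete, here is a minimal sketch using a sign-gradient perturbation in the style of the fast gradient sign method (FGSM). FGSM is not named in the overview above and is used here only for illustration. The toy linear classifier, its weights, the input values, and the budget eps are all hypothetical; real attacks target deep networks and compute gradients by backpropagation.

```python
# Illustrative only: a toy linear classifier, not a deep network.
def predict(w, b, x):
    """Return 1 if the linear score w.x + b is positive, else 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm_linear(w, x, label, eps):
    """Shift each feature by at most eps, pushing the score away from `label`.

    For a linear score the gradient with respect to x is simply w, so the
    sign-gradient step is eps * sign(w), applied against the current class.
    """
    direction = -1 if label == 1 else 1
    return [xi + direction * eps * sign(wi) for wi, xi in zip(w, x)]


w = [0.5, -0.25, 0.75, -0.5]  # hypothetical weights
b = 0.0                       # hypothetical bias
x = [0.2, 0.1, 0.1, 0.1]      # hypothetical clean input

clean_label = predict(w, b, x)
x_adv = fgsm_linear(w, x, clean_label, eps=0.1)
adv_label = predict(w, b, x_adv)

# Largest change applied to any single feature (bounded by eps).
max_change = max(abs(a - c) for a, c in zip(x_adv, x))

print(clean_label, adv_label, max_change)
```

No feature moves by more than eps, yet the predicted class changes. In high-dimensional inputs such as images, many small per-feature changes of this kind can add up to a large change in the model's score, which is one reason perturbations that are hard to see can still flip a prediction.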
Papers
Reverse engineering adversarial attacks with fingerprints from adversarial examples
David Aaron Nicholson, Vincent Emanuele
RS-Del: Edit Distance Robustness Certificates for Sequence Classifiers via Randomized Deletion
Zhuoqun Huang, Neil G. Marchant, Keane Lucas, Lujo Bauer, Olga Ohrimenko, Benjamin I. P. Rubinstein
Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers
Sungmin Cha, Sungjun Cho, Dasol Hwang, Honglak Lee, Taesup Moon, Moontae Lee
Adapting Step-size: A Unified Perspective to Analyze and Improve Gradient-based Methods for Adversarial Attacks
Wei Tao, Lei Bao, Sheng Long, Gaowei Wu, Qing Tao
Unleashing the Power of Visual Prompting At the Pixel Level
Junyang Wu, Xianhang Li, Chen Wei, Huiyu Wang, Alan Yuille, Yuyin Zhou, Cihang Xie
In and Out-of-Domain Text Adversarial Robustness via Label Smoothing
Yahan Yang, Soham Dan, Dan Roth, Insup Lee
Multi-head Uncertainty Inference for Adversarial Attack Detection
Yuqi Yang, Songyun Yang, Jiyang Xie, Zhongwei Si, Kai Guo, Ke Zhang, Kongming Liang
TextGrad: Advancing Robustness Evaluation in NLP by Gradient-Driven Optimization
Bairu Hou, Jinghan Jia, Yihua Zhang, Guanhua Zhang, Yang Zhang, Sijia Liu, Shiyu Chang
Discrete Point-wise Attack Is Not Enough: Generalized Manifold Adversarial Attack for Face Recognition
Qian Li, Yuxiao Hu, Ye Liu, Dongxiao Zhang, Xin Jin, Yuntian Chen