Adversarial Example
Adversarial examples are inputs with small, often imperceptible perturbations designed to fool machine learning models, primarily deep neural networks (DNNs), into making incorrect predictions. Current research focuses on improving model robustness against these attacks, exploring techniques such as ensemble methods, multi-objective representation learning, and adversarial training, often applied to architectures such as ResNets and Vision Transformers. Understanding and mitigating adversarial examples is crucial for the reliability and security of AI systems across diverse applications, from image classification and natural language processing to malware detection and autonomous driving, and the development of robust defenses and effective attack detection methods remains an active area of investigation.
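To make the idea concrete, the sketch below generates an adversarial example with the Fast Gradient Sign Method (FGSM), one of the simplest gradient-based attacks. The model, input, label, and epsilon value are illustrative assumptions for a self-contained demo and are not taken from any of the papers listed below.

```python
# Minimal FGSM sketch in PyTorch (illustrative; model and epsilon are assumptions).
import torch
import torch.nn as nn

def fgsm_attack(model, x, y, epsilon=0.03):
    """Return a perturbed copy of x that increases the loss on label y."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x_adv), y)
    loss.backward()
    # Step each input element by epsilon in the direction that raises the loss.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    # Keep the result in the valid input range [0, 1].
    return x_adv.clamp(0.0, 1.0).detach()

if __name__ == "__main__":
    # Toy classifier on 3x32x32 inputs, purely for demonstration.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    x = torch.rand(1, 3, 32, 32)   # a "clean" input
    y = torch.tensor([0])          # its assumed true label
    x_adv = fgsm_attack(model, x, y)
    print("max perturbation:", (x_adv - x).abs().max().item())  # <= epsilon
```

Adversarial training, mentioned above, folds examples like x_adv back into the training loop so the model learns to classify them correctly.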
Papers
Interpretability is a Kind of Safety: An Interpreter-based Ensemble for Adversary Defense
Jingyuan Wang, Yufan Wu, Mingxuan Li, Xin Lin, Junjie Wu, Chao Li
Generating Adversarial Examples with Better Transferability via Masking Unimportant Parameters of Surrogate Model
Dingcheng Yang, Wenjian Yu, Zihao Xiao, Jiaqi Luo
How many dimensions are required to find an adversarial example?
Charles Godfrey, Henry Kvinge, Elise Bishoff, Myles Mckay, Davis Brown, Tim Doster, Eleanor Byler
Adversarial Attack and Defense for Medical Image Analysis: Methods and Applications
Junhao Dong, Junxi Chen, Xiaohua Xie, Jianhuang Lai, Hao Chen