Adversarial Example
Adversarial examples are subtly perturbed inputs crafted to fool machine learning models, primarily deep neural networks (DNNs), into making incorrect predictions. Current research focuses on improving model robustness against such attacks through techniques like ensemble methods, multi-objective representation learning, and adversarial training, often applied to architectures such as ResNets and Vision Transformers. Understanding and mitigating adversarial examples is crucial for the reliability and security of AI systems across diverse applications, from image classification and natural language processing to malware detection and autonomous driving. Developing robust defenses and effective attack-detection methods remains an active area of investigation.
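To make the idea concrete, below is a minimal sketch of the fast gradient sign method (FGSM), one common way adversarial perturbations are generated; it is illustrative only, and the model, input, and label are assumed placeholders rather than any specific method from the papers listed here.

```python
# Minimal FGSM sketch (PyTorch). `model`, `x`, and `y` are assumed placeholders:
# any differentiable classifier with inputs in [0, 1] works the same way.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """Return an adversarial version of x within an L-infinity ball of radius epsilon."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to the valid image range.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

A single FGSM step like this is the building block of stronger iterative attacks such as PGD (used, for example, by SegPGD below), which repeat a smaller step and project back into the epsilon-ball; adversarial training reuses the same procedure to generate perturbed inputs during training.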
Papers
An Evolutionary, Gradient-Free, Query-Efficient, Black-Box Algorithm for Generating Adversarial Instances in Deep Networks
Raz Lapid, Zvika Haramaty, Moshe Sipper
Two Heads are Better than One: Robust Learning Meets Multi-branch Models
Dong Huang, Qingwen Bu, Yuhao Qing, Haowen Pi, Sen Wang, Heming Cui
$p$-DkNN: Out-of-Distribution Detection Through Statistical Testing of Deep Representations
Adam Dziedzic, Stephan Rabanser, Mohammad Yaghini, Armin Ale, Murat A. Erdogdu, Nicolas Papernot
SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness
Jindong Gu, Hengshuang Zhao, Volker Tresp, Philip Torr