Adversarial Example
Adversarial examples are subtly altered inputs designed to fool machine learning models, primarily deep neural networks (DNNs), into making incorrect predictions. Current research focuses on improving model robustness against these attacks, exploring techniques like ensemble methods, multi-objective representation learning, and adversarial training, often applied to architectures such as ResNets and Vision Transformers. Understanding and mitigating the threat of adversarial examples is crucial for ensuring the reliability and security of AI systems across diverse applications, from image classification and natural language processing to malware detection and autonomous driving. The development of robust defenses and effective attack detection methods remains a significant area of ongoing investigation.
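As a rough illustration of what "subtly altered" means, the sketch below builds one adversarial example against a toy logistic-regression classifier. It uses a single signed-gradient step (the fast gradient sign method, one common attack chosen here for illustration, not taken from the papers listed on this page). The weights `W`, bias `B`, input `x`, and budget `eps` are all made-up values, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """One signed-gradient step that increases the loss on (x, y).

    For binary cross-entropy with a logistic model, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    # Each feature moves by at most eps, so the change stays small.
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Illustrative values only (assumptions, not from any real model).
W = [2.0, -3.0, 1.0]
B = 0.0
x = [0.5, 0.1, 0.2]   # clean input with true label 1
y = 1
eps = 0.2             # maximum per-feature perturbation

x_adv = fgsm(W, B, x, y, eps)
clean_label = int(predict_proba(W, B, x) >= 0.5)
adv_label = int(predict_proba(W, B, x_adv) >= 0.5)
print("clean prediction:", clean_label)
print("adversarial prediction:", adv_label)
```

In this toy setting no feature changes by more than `eps`, yet the predicted class flips. Adversarial training, mentioned above, would in this sketch amount to adding inputs like `x_adv` with their correct labels to the training data.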
Papers
Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification
Sonal Joshi, Thomas Thebaud, Jesús Villalba, Najim Dehak
Pointing out the Shortcomings of Relation Extraction Models with Semantically Motivated Adversarials
Gennaro Nolano, Moritz Blum, Basil Ell, Philipp Cimiano
How to Train your Antivirus: RL-based Hardening through the Problem-Space
Ilias Tsingenopoulos, Jacopo Cortellazzi, Branislav Bošanský, Simone Aonzo, Davy Preuveneers, Wouter Joosen, Fabio Pierazzi, Lorenzo Cavallaro
MPAT: Building Robust Deep Neural Networks against Textual Adversarial Attacks
Fangyuan Zhang, Huichi Zhou, Shuangjiao Li, Hongtao Wang
Enhancing the "Immunity" of Mixture-of-Experts Networks for Adversarial Defense
Qiao Han, Yong Huang, Xinling Guo, Yiteng Zhai, Yu Qin, Yao Yang