Adversarial Example
Adversarial examples are subtly perturbed inputs crafted to fool machine learning models, primarily deep neural networks (DNNs), into making incorrect predictions. Current research focuses on improving model robustness against these attacks, exploring techniques such as ensemble methods, multi-objective representation learning, and adversarial training, often applied to architectures such as ResNets and Vision Transformers. Understanding and mitigating adversarial examples is crucial for the reliability and security of AI systems across diverse applications, from image classification and natural language processing to malware detection and autonomous driving. The development of robust defenses and effective attack detection methods remains an active area of research.
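As a concrete illustration of how such perturbations are typically crafted, the sketch below implements the Fast Gradient Sign Method (FGSM) in PyTorch, one of the simplest gradient-based attacks. It is a minimal example, not drawn from any of the papers listed here; the classifier, the perturbation budget `epsilon`, and the assumption that inputs lie in [0, 1] are illustrative choices.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """Craft adversarial examples with the Fast Gradient Sign Method.

    model:   a PyTorch classifier returning logits (assumed, for illustration)
    x:       input batch with values in [0, 1]
    y:       ground-truth labels
    epsilon: L-infinity perturbation budget (hypothetical value)
    """
    model.eval()
    x_adv = x.clone().detach().requires_grad_(True)

    # Compute the loss of the clean prediction against the true labels.
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()

    # Step in the direction that increases the loss, then keep pixels valid.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Adversarial training, one of the defenses mentioned above, reuses this kind of attack inside the training loop: perturbed inputs are generated on the fly and the model is optimized on them instead of (or alongside) the clean batch.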
Papers
Adversarial Math Word Problem Generation
Roy Xie, Chengxuan Huang, Junlin Wang, Bhuwan Dhingra
Black-box Adversarial Attacks Against Image Quality Assessment Models
Yu Ran, Ao-Xiang Zhang, Mingjie Li, Weixuan Tang, Yuan-Gen Wang
Extreme Miscalibration and the Illusion of Adversarial Robustness
Vyas Raina, Samson Tan, Volkan Cevher, Aditya Rawal, Sheng Zha, George Karypis
Robustness-Congruent Adversarial Training for Secure Machine Learning Model Updates
Daniele Angioni, Luca Demetrio, Maura Pintor, Luca Oneto, Davide Anguita, Battista Biggio, Fabio Roli
Adversarial Example Soups: Improving Transferability and Stealthiness for Free
Bo Yang, Hengwei Zhang, Jindong Wang, Yulong Yang, Chenhao Lin, Chao Shen, Zhengyu Zhao
Distilling Adversarial Robustness Using Heterogeneous Teachers
Jieren Deng, Aaron Palmer, Rigel Mahmood, Ethan Rathbun, Jinbo Bi, Kaleel Mahmood, Derek Aguiar
Deep Networks Always Grok and Here is Why
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation
Yi Zhang, Yun Tang, Wenjie Ruan, Xiaowei Huang, Siddartha Khastgir, Paul Jennings, Xingyu Zhao
Query-Based Adversarial Prompt Generation
Jonathan Hayase, Ema Borevkovic, Nicholas Carlini, Florian Tramèr, Milad Nasr
Adversarial Feature Alignment: Balancing Robustness and Accuracy in Deep Learning via Adversarial Training
Leo Hyun Park, Jaeuk Kim, Myung Gyo Oh, Jaewoo Park, Taekyoung Kwon
Stealing the Invisible: Unveiling Pre-Trained CNN Models through Adversarial Examples and Timing Side-Channels
Shubhi Shukla, Manaar Alam, Pabitra Mitra, Debdeep Mukhopadhyay
Assessing biomedical knowledge robustness in large language models by query-efficient sampling attacks
R. Patrick Xian, Alex J. Lee, Satvik Lolla, Vincent Wang, Qiming Cui, Russell Ro, Reza Abbasi-Asl
Theoretical Understanding of Learning from Adversarial Perturbations
Soichiro Kumano, Hiroshi Kera, Toshihiko Yamasaki