Adversarial Robustness
Adversarial robustness focuses on developing machine learning models that resist adversarial attacks: small, carefully crafted input perturbations designed to cause misclassification. Current research investigates diverse defense mechanisms, including adversarial training, data purification with diffusion models, and biologically inspired regularizers, applied across convolutional neural networks (CNNs), transformers, and spiking neural networks (SNNs). This work is crucial for ensuring the reliability and safety of AI systems in real-world applications, particularly in safety-critical domains such as autonomous driving and healthcare, where model failures can have severe consequences.
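As a rough illustration of the concepts above, the sketch below crafts a fast gradient sign method (FGSM) adversarial example and uses it for one adversarial training step. It assumes a generic PyTorch classifier; the function names, epsilon value, and training setup are illustrative placeholders and are not drawn from any of the papers listed below.

```python
import torch
import torch.nn as nn

def fgsm_attack(model, x, y, epsilon=0.03):
    """Craft an FGSM adversarial example: a small perturbation of x,
    bounded by epsilon in the L-infinity norm, chosen to increase the loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x_adv), y)
    loss.backward()
    # Step in the direction of the sign of the input gradient,
    # then clamp back to the valid pixel range [0, 1].
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One step of adversarial training (sketch): fit the model on
    perturbed inputs rather than clean ones."""
    x_adv = fgsm_attack(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = nn.CrossEntropyLoss()(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Stronger attacks (e.g., multi-step projected gradient descent) and the purification- or regularizer-based defenses mentioned above follow the same pattern but differ in how the perturbation is generated or removed.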
Papers
Adversarial Robustness of In-Context Learning in Transformers for Linear Regression
Usman Anwar, Johannes Von Oswald, Louis Kirsch, David Krueger, Spencer Frei
Game-Theoretic Defenses for Robust Conformal Prediction Against Adversarial Attacks in Medical Imaging
Rui Luo, Jie Bao, Zhixin Zhou, Chuangyin Dang