Robustness Evaluation
Robustness evaluation assesses the reliability and stability of machine learning models under perturbations and unexpected inputs, with the goal of ensuring safe and effective deployment in real-world applications. Current research focuses on developing comprehensive benchmarks and metrics for measuring robustness across domains such as natural language processing, computer vision, and reinforcement learning, often using adversarial attacks and data augmentation to stress-test models. The field is central to building trustworthy AI systems: robust models are less prone to failures caused by noisy data, adversarial inputs, or unexpected environmental conditions, which improves the safety and reliability of AI-driven technologies.
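As a minimal illustration of such a stress test, the sketch below compares a classifier's accuracy on clean inputs against its accuracy under additive Gaussian input noise; the drop between the two is a simple robustness signal. It assumes a PyTorch model and data loader (both hypothetical here), and real benchmarks typically use much richer perturbation suites and adversarial attacks.

import torch

def robustness_gap(model, loader, noise_std=0.1, device="cpu"):
    """Compare clean accuracy with accuracy under Gaussian input noise.

    This is a sketch of a single-perturbation stress test, not a full
    robustness benchmark; `model` and `loader` are assumed to exist.
    """
    model.eval()
    clean_correct, noisy_correct, total = 0, 0, 0
    with torch.no_grad():
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            # Accuracy on unmodified inputs.
            clean_correct += (model(inputs).argmax(dim=1) == labels).sum().item()
            # Accuracy under a simple perturbation: additive Gaussian noise.
            noisy = inputs + noise_std * torch.randn_like(inputs)
            noisy_correct += (model(noisy).argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
    clean_acc = clean_correct / total
    noisy_acc = noisy_correct / total
    return clean_acc, noisy_acc, clean_acc - noisy_acc

A larger accuracy gap at a given noise level suggests the model is more sensitive to that perturbation; sweeping noise_std (or swapping in adversarial perturbations) gives a fuller picture of robustness.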
Papers
Towards Class-wise Robustness Analysis
Tejaswini Medi, Julia Grabinski, Margret Keuper
Risk-Averse Certification of Bayesian Neural Networks
Xiyue Zhang, Zifan Wang, Yulong Gao, Licio Romao, Alessandro Abate, Marta Kwiatkowska
SURE-VQA: Systematic Understanding of Robustness Evaluation in Medical VQA Tasks
Kim-Celine Kahl, Selen Erkan, Jeremias Traub, Carsten T. Lüth, Klaus Maier-Hein, Lena Maier-Hein, Paul F. Jaeger