Adversarial Testing
Adversarial testing probes the robustness of machine learning models, particularly large language models (LLMs) and deep learning systems for computer vision, by feeding them inputs deliberately crafted to elicit failures or expose biases. Current research focuses on automating attacks, for example with generative agents and single-turn crescendo attacks, and on strengthening defenses through techniques such as conformal prediction and robust training. By identifying and mitigating vulnerabilities before deployment, this work supports the safety and reliability of AI systems in applications ranging from autonomous vehicles to medical diagnosis.
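To make "carefully crafted inputs" concrete, here is a minimal sketch of an adversarial robustness check on a toy logistic-regression classifier. It uses the fast gradient sign method (FGSM), a classic attack that the summary above does not name and that is chosen here purely for illustration. The weights, the input, and every function name are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm_perturb(w, b, x, y, eps):
    """Shift each feature by eps in the direction that increases the loss.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input x is (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


def is_robust(w, b, x, y, eps):
    """True if the prediction still matches label y after the perturbation."""
    x_adv = fgsm_perturb(w, b, x, y, eps)
    return (predict_proba(w, b, x_adv) >= 0.5) == bool(y)


# Hypothetical toy model and a correctly classified input with label 1.
W = [2.0, -1.0, 0.5]
B = -0.2
X = [0.6, 0.3, 0.4]
Y = 1

if __name__ == "__main__":
    for eps in (0.1, 0.3):
        print(f"eps={eps}: robust={is_robust(W, B, X, Y, eps)}")
```

With these toy numbers the clean logit is 0.9, and each unit of eps lowers it by 3.5, so the prediction survives eps = 0.1 but flips at eps = 0.3. A real adversarial test suite would run a check like this over many inputs and report the fraction that stay correctly classified at each perturbation budget.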