Data Poisoning Attack
Data poisoning attacks inject malicious data into training datasets to compromise the performance or behavior of machine learning models. Current research focuses on understanding how vulnerable different systems, including linear solvers, federated learning systems, large language models, and clustering algorithms, are to poisoning strategies such as backdoor attacks, label flipping, and feature manipulation. This is a critical area of study because a successful data poisoning attack can severely undermine the reliability and trustworthiness of machine learning systems across numerous applications, from healthcare to autonomous vehicles, making the development of robust defenses essential.
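As a concrete illustration of one of the simpler strategies mentioned above, the sketch below simulates a label-flipping attack against a binary classifier and measures how clean test accuracy degrades as the poison rate grows. It is a minimal example assuming a scikit-learn workflow; the synthetic dataset, the logistic-regression victim model, the flip_labels helper, and the chosen poison rates are illustrative assumptions, not taken from any of the papers listed here.

    # Minimal sketch of a label-flipping poisoning attack (illustrative assumptions only).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    def flip_labels(y, poison_rate, rng):
        """Return a copy of y with a random fraction of binary labels flipped."""
        y_poisoned = y.copy()
        n_poison = int(poison_rate * len(y))
        idx = rng.choice(len(y), size=n_poison, replace=False)
        y_poisoned[idx] = 1 - y_poisoned[idx]  # flip 0 <-> 1
        return y_poisoned

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    for rate in [0.0, 0.1, 0.3]:
        # The attacker only tampers with training labels; the test set stays clean.
        y_train_poisoned = flip_labels(y_train, rate, rng)
        model = LogisticRegression(max_iter=1000).fit(X_train, y_train_poisoned)
        acc = accuracy_score(y_test, model.predict(X_test))
        print(f"poison rate {rate:.0%}: clean test accuracy {acc:.3f}")

Running this typically shows accuracy on clean test data dropping as the poison rate increases, which is the basic effect that the defenses and vulnerability studies below aim to characterize or mitigate.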
Papers
Fragile Giants: Understanding the Susceptibility of Models to Subpopulation Attacks
Isha Gupta, Hidde Lycklama, Emanuel Opel, Evan Rose, Anwar Hithnawi
PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning
Tingchen Fu, Mrinank Sharma, Philip Torr, Shay B. Cohen, David Krueger, Fazl Barez
PureEBM: Universal Poison Purification via Mid-Run Dynamics of Energy-Based Models
Omead Pooladzandi, Jeffrey Jiang, Sunay Bhat, Gregory Pottie
PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics
Sunay Bhat, Jeffrey Jiang, Omead Pooladzandi, Alexander Branch, Gregory Pottie