Flip Attack

Flip attacks exploit the sensitivity of machine learning models to small, targeted manipulations of their data in order to degrade performance or inject malicious behavior. Current research focuses on efficient flip attacks against a range of model architectures, including large language models (LLMs), graph neural networks (GNNs), and other deep neural networks (DNNs), often using techniques such as bit flipping or label alteration. These attacks expose serious security risks in deployed machine learning systems, particularly in safety-critical applications, and motivate the development of defenses that are robust to data manipulation.
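To make the two techniques named above concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not the method of any particular paper: it assumes a model parameter stored as an IEEE 754 float32 and a training set with integer class labels. The helper names `flip_bit` and `flip_labels` are hypothetical.

```python
import random
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant) of a value's float32 encoding.

    Assumption for illustration: the parameter is stored as IEEE 754 float32.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range [0, 32)")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (encoded,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR with a one-bit mask toggles exactly that bit.
    (flipped,) = struct.unpack("<f", struct.pack("<I", encoded ^ (1 << bit)))
    return flipped


def flip_labels(labels, fraction, num_classes, seed=0):
    """Return a copy of `labels` with a random fraction moved to another class.

    Hypothetical helper: real attacks typically choose which labels to flip
    far more carefully than uniform random sampling.
    """
    rng = random.Random(seed)
    poisoned = list(labels)
    count = int(len(poisoned) * fraction)
    for i in rng.sample(range(len(poisoned)), count):
        # Pick any class other than the current one.
        choices = [c for c in range(num_classes) if c != poisoned[i]]
        poisoned[i] = rng.choice(choices)
    return poisoned
```

In this sketch, `flip_bit(0.5, 30)` toggles the most significant exponent bit and returns `2.0 ** 127` (roughly 1.7e38), which suggests why a single flipped bit in a stored parameter can matter so much. `flip_labels([0] * 10, 0.3, 3)` changes exactly three of the ten labels.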

Papers