Invisible Attacks

Invisible attacks compromise machine learning models by manipulating input data or model parameters in ways that evade standard detection, with the goal of degrading accuracy or inducing harmful outputs. Current research develops both attack strategies, such as embedding imperceptible triggers in seemingly benign data using techniques like wavelet transforms, and robust defenses, including circuit breakers that interrupt harmful model responses and one-class classifiers that flag deviations from normal behavior. This field is crucial for the safety and reliability of AI systems across diverse applications, from autonomous driving to cybersecurity, where the consequences of an undetected attack can be severe.
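To make the wavelet-based trigger idea concrete, the following is a minimal, hypothetical sketch (not taken from any specific paper in this collection): it embeds a secret ±1 key pattern into the high-frequency subband of a single-level 2D Haar transform, so the per-pixel perturbation stays bounded by the trigger strength and is visually imperceptible. The names `trigger`, `eps`, and `score` are illustrative assumptions; the final correlation score also hints at how a key-holding detector, in the spirit of the one-class defenses mentioned above, could flag poisoned inputs.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D orthonormal Haar transform -> (LL, LH, HL, HH) subbands."""
    a = (x[:, ::2] + x[:, 1::2]) / np.sqrt(2)   # row low-pass
    d = (x[:, ::2] - x[:, 1::2]) / np.sqrt(2)   # row high-pass
    return ((a[::2] + a[1::2]) / np.sqrt(2),    # LL
            (a[::2] - a[1::2]) / np.sqrt(2),    # LH
            (d[::2] + d[1::2]) / np.sqrt(2),    # HL
            (d[::2] - d[1::2]) / np.sqrt(2))    # HH

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2 (the transform is orthonormal)."""
    a = np.empty((2 * ll.shape[0], ll.shape[1]))
    d = np.empty_like(a)
    a[::2], a[1::2] = (ll + lh) / np.sqrt(2), (ll - lh) / np.sqrt(2)
    d[::2], d[1::2] = (hl + hh) / np.sqrt(2), (hl - hh) / np.sqrt(2)
    x = np.empty((a.shape[0], 2 * a.shape[1]))
    x[:, ::2], x[:, 1::2] = (a + d) / np.sqrt(2), (a - d) / np.sqrt(2)
    return x

rng = np.random.default_rng(0)
img = rng.random((32, 32))                        # stand-in for a grayscale image in [0, 1]
ll, lh, hl, hh = haar_dwt2(img)

eps = 1e-3                                        # trigger strength (hypothetical choice)
trigger = np.sign(rng.standard_normal(hh.shape))  # secret +/-1 key pattern
poisoned = haar_idwt2(ll, lh, hl, hh + eps * trigger)

# The perturbation lives only in the high-frequency (HH) subband, so the
# per-pixel change is bounded by eps and is visually imperceptible:
print(np.abs(poisoned - img).max())

# A detector holding the key can score an input by correlating its HH
# subband with the trigger; poisoning shifts the score by exactly eps:
score = lambda x: float(np.mean(haar_dwt2(x)[3] * trigger))
print(score(poisoned) - score(img))
```

In practice, published attacks use deeper wavelet decompositions and natural images, and defenses must work without access to the secret key; this toy version only shows why frequency-domain triggers can stay below human-visible thresholds while remaining statistically detectable.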

Papers