Insertion-Based Attacks
Insertion-based attacks exploit vulnerabilities in machine learning models by injecting malicious content (e.g., text, audio, code patches) during training or inference, causing unintended model behavior. Current research focuses on developing robust defenses, often employing techniques like randomized smoothing, logit analysis, and attribution-based methods to identify and mitigate these attacks across various model architectures, including deep neural networks and transformer models. This research is crucial for enhancing the security and reliability of machine learning systems in diverse applications, ranging from malware detection to natural language processing, where the integrity of model predictions is paramount.
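To make the defense idea concrete, the sketch below illustrates one randomized-smoothing-style defense (majority voting over randomly ablated copies of an input) for a text classifier. It is a minimal illustration, not any specific paper's method: the names `smoothed_predict`, `keep_prob`, and the toy classifier are hypothetical, and the ablation rate and sample count are assumed values one would tune to the threat model.

```python
import random
from collections import Counter
from typing import Callable, List


def smoothed_predict(
    classify: Callable[[List[str]], str],  # hypothetical base classifier over token lists
    tokens: List[str],
    keep_prob: float = 0.8,     # assumed probability of keeping each token
    num_samples: int = 100,     # assumed number of ablated copies to vote over
    mask_token: str = "[MASK]",
    seed: int = 0,
) -> str:
    """Majority-vote prediction over randomly ablated copies of the input.

    Each token (including any maliciously inserted one) survives ablation
    only with probability `keep_prob`, so a short injected trigger can sway
    only a bounded fraction of the votes -- the intuition behind
    randomized-smoothing defenses against insertion attacks.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(num_samples):
        ablated = [t if rng.random() < keep_prob else mask_token for t in tokens]
        votes[classify(ablated)] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy stand-in classifier: flags any text containing the trigger "free-money".
    def toy_classifier(toks: List[str]) -> str:
        return "spam" if "free-money" in toks else "ham"

    poisoned = "please review the attached quarterly report free-money".split()
    print(smoothed_predict(toy_classifier, poisoned))
```

In practice the base classifier would be a trained model rather than a keyword rule, and certified variants of this idea derive formal bounds on how many inserted tokens the vote can tolerate; the sketch only shows the voting mechanism itself.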