Hijacking Task
Model hijacking attacks repurpose machine learning models to perform attacker-chosen tasks they were never intended for, posing significant security and privacy risks. Current research focuses on understanding and mitigating these attacks across learning paradigms (centralized, federated, and split learning) and modalities (image and text), employing techniques such as pixel-level perturbations, encoder-decoder frameworks, and distance measures in latent spaces. This work is crucial for the trustworthiness and security of deployed machine learning systems, informing both the development of robust models and the responsible use of AI in sensitive applications.
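To make the named techniques concrete, below is a minimal, hypothetical sketch (not the method of any specific paper in this area): a hijacking-task image receives a small pixel-level perturbation optimized so that its representation under a frozen encoder moves close to that of a benign-task sample, illustrating how a latent-space distance measure can be used to disguise poisoned data. The `TinyEncoder` and `camouflage` names, architecture, and hyperparameters are placeholder assumptions; a real attack would target the victim pipeline's own feature extractor and data.

```python
import torch
import torch.nn as nn

# Placeholder frozen feature encoder standing in for the victim model's
# representation; any pretrained backbone could be substituted here.
class TinyEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )

    def forward(self, x):
        return self.net(x)

def camouflage(hijack_img, benign_img, encoder, steps=200, eps=8 / 255, lr=0.01):
    """Optimize a bounded pixel-level perturbation so the hijacking sample's
    latent representation approaches that of a benign sample."""
    delta = torch.zeros_like(hijack_img, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    with torch.no_grad():
        target_feat = encoder(benign_img)
    for _ in range(steps):
        opt.zero_grad()
        adv = (hijack_img + delta).clamp(0, 1)
        # Latent-space distance between the disguised sample and the benign target
        loss = nn.functional.mse_loss(encoder(adv), target_feat)
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the perturbation visually small
    return (hijack_img + delta.detach()).clamp(0, 1)

if __name__ == "__main__":
    enc = TinyEncoder().eval()
    hijack = torch.rand(1, 3, 32, 32)  # sample from the attacker's hidden task
    benign = torch.rand(1, 3, 32, 32)  # sample from the victim's original task
    disguised = camouflage(hijack, benign, enc)
    print(disguised.shape)
```

In this toy setting, the disguised sample would be injected into the victim's training set so the model additionally learns the attacker's task while its poisoned inputs remain visually and statistically close to benign data.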