Hijacking Task

Model hijacking attacks exploit vulnerabilities in machine learning models to repurpose them for tasks their owners never intended, posing significant security and privacy risks. Current research focuses on understanding and mitigating these attacks across learning paradigms (centralized, federated, and split learning) and modalities (image and text). Attacks typically poison the training data with camouflaged samples from the hijacking task, using techniques such as pixel-level perturbations, encoder-decoder camouflaging frameworks, and distance measures in latent spaces. This research is crucial for the trustworthiness and security of deployed machine learning systems, informing both the development of robust models and the responsible use of AI in sensitive applications.
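
The encoder-decoder camouflaging idea can be sketched in a few lines of PyTorch. The snippet below is a minimal illustration under stated assumptions, not any specific paper's implementation: the `Camouflager` architecture, the `feature_extractor`, the loss weights, and the toy data are all hypothetical choices introduced here. It shows the core recipe: produce a poisoned sample that is visually close to the original-task data (pixel-level distance) while staying close to the hijacking sample in a latent space (semantic distance).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Camouflager(nn.Module):
    """Illustrative encoder-decoder that disguises a hijacking-task image as an original-task image."""
    def __init__(self):
        super().__init__()
        # Separate encoders for the hijacking sample and the original-task "disguise" sample.
        self.enc_hijack = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.enc_orig = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        # Decoder fuses the two latent codes and emits the camouflaged image.
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x_hijack, x_orig):
        z = torch.cat([self.enc_hijack(x_hijack), self.enc_orig(x_orig)], dim=1)
        return self.dec(z)

# Hypothetical feature extractor defining the latent space for the semantic distance term.
feature_extractor = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten())

def camouflage_loss(cam_out, x_hijack, x_orig, alpha=1.0):
    # Visual term: the camouflaged image should look like the original-task sample.
    visual = F.mse_loss(cam_out, x_orig)
    # Semantic term: it should stay close to the hijacking sample in the latent space,
    # so the victim model still learns the hijacking task from it.
    semantic = F.mse_loss(feature_extractor(cam_out), feature_extractor(x_hijack))
    return visual + alpha * semantic

# Toy usage with random 32x32 images standing in for the two datasets.
cam = Camouflager()
x_h = torch.rand(4, 3, 32, 32)  # hijacking-task samples (attacker's dataset)
x_o = torch.rand(4, 3, 32, 32)  # original-task samples used as disguise
loss = camouflage_loss(cam(x_h, x_o), x_h, x_o)
loss.backward()  # a real attack would run an optimizer over the camouflager's parameters
```

In a full attack, the camouflaged outputs would then be assigned original-task labels via a fixed mapping from hijacking-task labels and mixed into the victim model's training set.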

Papers