Task Agnostic Backdoor

Task-agnostic backdoors are malicious modifications embedded in machine learning models, particularly large language models and vision transformers, that trigger unintended behavior regardless of the specific task the model is performing. Current research focuses on developing both sophisticated attack methods, often leveraging minimal data poisoning or parameter-efficient fine-tuning, and robust defenses against these attacks, exploring techniques like modifying loss functions or manipulating model embeddings. The widespread use of pre-trained models and the increasing reliance on continual learning highlight the critical need for effective defenses against this threat, impacting the security and reliability of numerous AI applications.

Papers