Cross Lingual Backdoor Attack

Cross-lingual backdoor attacks exploit vulnerabilities in multilingual large language models (LLMs), aiming to trigger malicious outputs in multiple languages by poisoning training data in one or a few languages. Current research focuses on understanding the transferability of these attacks across different languages, investigating the susceptibility of various LLMs (including mT5, GPT-4, Llama series) to such attacks, and developing defense mechanisms against them, often involving embedding manipulation or translation-based techniques. These findings highlight significant security risks in multilingual LLMs and underscore the need for robust security measures to ensure the safe deployment of these increasingly prevalent technologies.

Papers