Adversarial Text
Adversarial text research studies how to craft, and defend against, text inputs designed to deceive natural language processing (NLP) models, often by subtly altering wording while keeping the text semantically similar for a human reader. Current work emphasizes more effective attack methods, particularly those built on multi-agent systems, reinforcement learning, and diffusion models, alongside stronger defenses through techniques such as adversarial training and noise augmentation. The field is crucial for the robustness and trustworthiness of NLP systems across diverse applications, from automated essay scoring to autonomous vehicle navigation and large language model safety.
Papers
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory
Sensen Gao, Xiaojun Jia, Xuhong Ren, Ivor Tsang, Qing Guo
Electioneering the Network: Dynamic Multi-Step Adversarial Attacks for Community Canvassing
Saurabh Sharma, Ambuj Singh