Semantic Adversarial
Semantic adversarial attacks manipulate the semantic content of data, such as images or text, to create adversarial examples that fool machine learning models while still appearing natural to humans. Current research focuses on making these attacks more efficient and effective, often using diffusion models, generative adversarial networks (GANs), and large language models (LLMs) to generate semantically consistent perturbations. The area matters because it exposes vulnerabilities in machine learning systems and drives the development of more robust and reliable models, with implications for applications including image recognition, natural language processing, and autonomous systems.
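To make the idea concrete, the following is a minimal sketch of a text-based semantic attack, not a method from any of the papers listed here. It assumes a deliberately naive keyword-based sentiment classifier and a hand-written synonym table; the names `classify`, `semantic_attack`, and `SYNONYMS` are hypothetical and exist only for this illustration. The attack swaps words for synonyms that a human would read as equivalent until the toy model's label changes.

```python
# Toy illustration only: the classifier and the synonym table are
# assumptions made for this sketch, not part of any real library.

NEGATIVE_WORDS = {"bad", "terrible", "awful"}
POSITIVE_WORDS = {"good", "great", "excellent"}

# Hypothetical meaning-preserving replacements the toy model does not know.
SYNONYMS = {
    "terrible": ["dreadful", "atrocious"],
    "bad": ["poor", "subpar"],
    "awful": ["abysmal"],
}


def classify(text):
    """Label text by counting known positive and negative keywords."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE_WORDS for t in tokens)
    score -= sum(t in NEGATIVE_WORDS for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def semantic_attack(text):
    """Greedily substitute synonyms until the predicted label changes.

    Returns the perturbed text, or None if no label change was found.
    """
    original_label = classify(text)
    tokens = text.split()
    for i, token in enumerate(tokens):
        candidates = SYNONYMS.get(token.lower())
        if not candidates:
            continue
        # Keep the substitution and move on if the label has not flipped yet.
        tokens[i] = candidates[0]
        perturbed = " ".join(tokens)
        if classify(perturbed) != original_label:
            return perturbed
    return None
```

For the input "the plot was terrible", the toy classifier returns "negative", while the perturbed sentence "the plot was dreadful" is labeled "neutral" even though a human reader would judge both sentences the same way. Attacks in the literature typically replace the synonym table with a generative model and the keyword counter with a trained network, but the structure (a meaning-preserving edit that changes the model's output) is the same.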
Papers
Robustness of Large Language Models Against Adversarial Attacks
Yiyi Tao, Yixian Shen, Hang Zhang, Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du
Breaking Barriers in Physical-World Adversarial Examples: Improving Robustness and Transferability via Robust Feature
Yichen Wang, Yuxuan Chou, Ziqi Zhou, Hangtao Zhang, Wei Wan, Shengshan Hu, Minghui Li