Textual Adversarial Example
Textual adversarial examples are subtly altered text inputs designed to deceive natural language processing (NLP) models, exposing weaknesses in their robustness. Current research focuses on developing more effective attack methods, often based on synonym substitution and phrase-level manipulation, against a range of model architectures, including BERT and other large language models (LLMs), as well as defenses such as test-time adaptation and manifold-based approaches. Understanding and mitigating these vulnerabilities is crucial for the reliability and security of NLP systems in real-world applications, particularly in safety-critical domains.
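To make the synonym-substitution idea concrete, the sketch below shows a minimal greedy attack: each word is tentatively replaced by a WordNet synonym, and a substitution is kept only if it lowers the model's confidence in the original label. The black-box scorer `predict_proba(text) -> dict[label, prob]` is a hypothetical stand-in for whatever classifier is under attack; real attacks typically add constraints such as part-of-speech agreement and embedding-similarity filters to keep the perturbed text fluent.

```python
# Minimal sketch of a greedy synonym-substitution attack.
# Assumes a black-box classifier exposed as predict_proba(text) -> dict[label, prob].
import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)


def wordnet_synonyms(word):
    """Collect single-word WordNet synonyms for a given word."""
    synonyms = set()
    for synset in wordnet.synsets(word):
        for lemma in synset.lemmas():
            candidate = lemma.name().replace("_", " ")
            if candidate.lower() != word.lower() and " " not in candidate:
                synonyms.add(candidate)
    return sorted(synonyms)


def greedy_synonym_attack(text, true_label, predict_proba, max_swaps=5):
    """Greedily swap words for synonyms to reduce confidence in `true_label`."""
    words = text.split()
    best_score = predict_proba(text)[true_label]
    swaps = 0
    for i, word in enumerate(words):
        if swaps >= max_swaps:
            break
        for synonym in wordnet_synonyms(word):
            candidate_words = words[:i] + [synonym] + words[i + 1:]
            candidate_text = " ".join(candidate_words)
            score = predict_proba(candidate_text)[true_label]
            if score < best_score:
                # Keep the first substitution at this position that hurts the model.
                best_score = score
                words = candidate_words
                swaps += 1
                break
    return " ".join(words), best_score
```

The attack succeeds once the perturbed text's predicted label no longer matches `true_label`; phrase-level attacks follow the same loop but propose multi-word rewrites instead of single-word swaps.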