Trigger Word
Trigger words are specific words or phrases that elicit a strong response or action, and they are studied across diverse fields. Current research investigates their effect on social media engagement and animosity, their use in manipulating text-to-image models, and their role in backdoor attacks on machine learning systems. To detect and analyze trigger words, researchers employ techniques including sequence labeling, sparsity optimization, and contrastive learning, with the aims of improving model robustness and understanding how trigger words influence human behavior and technology. This work has implications for mitigating online harms, enhancing the security of AI systems, and better understanding human communication and bias.
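To make the sequence-labeling framing concrete, the following is a minimal sketch of how trigger words can be represented as per-token BIO labels. It is not taken from any of the listed papers: the lexicon, the `TRIG` label name, and the helper functions `bio_tag` and `extract_spans` are hypothetical, and a simple dictionary match stands in for the learned tagger that would normally predict these labels.

```python
from typing import List, Sequence, Tuple


def bio_tag(tokens: Sequence[str], triggers: Sequence[Sequence[str]]) -> List[str]:
    """Assign B-TRIG / I-TRIG / O labels by matching a (hypothetical) trigger lexicon."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Match longer phrases first so a multi-word trigger is not split by a shorter one.
    for phrase in sorted(triggers, key=len, reverse=True):
        n = len(phrase)
        if n == 0:
            continue
        target = [w.lower() for w in phrase]
        for i in range(len(tokens) - n + 1):
            span_is_free = all(label == "O" for label in labels[i:i + n])
            if lowered[i:i + n] == target and span_is_free:
                labels[i] = "B-TRIG"
                for j in range(i + 1, i + n):
                    labels[j] = "I-TRIG"
    return labels


def extract_spans(tokens: Sequence[str], labels: Sequence[str]) -> List[Tuple[int, int, str]]:
    """Decode BIO labels back into (start, end, text) spans, end exclusive."""
    spans: List[Tuple[int, int, str]] = []
    start = None
    for i, label in enumerate(labels):
        if label == "I-TRIG" and start is not None:
            continue  # the current span keeps growing
        if start is not None:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
        if label == "B-TRIG":
            start = i
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Illustrative input only: the sentence and the lexicon are made up.
tokens = "Click the secret phrase cf now".split()
lexicon = [["secret", "phrase"], ["cf"]]
labels = bio_tag(tokens, lexicon)
spans = extract_spans(tokens, labels)
```

In a learned setup, `bio_tag` would be replaced by a token-classification model trained on annotated text, while the label scheme and span decoding would stay the same.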
Papers
August 10, 2024
May 16, 2024
February 12, 2024
November 23, 2023
October 5, 2023
August 29, 2023
June 25, 2023
November 6, 2021