New Attack
Research on attacks against large language models (LLMs) and related AI systems is expanding rapidly, focusing on vulnerabilities that can be exploited to elicit harmful outputs or extract sensitive information. Current efforts concentrate on developing and evaluating attack methods such as jailbreaking, data poisoning, prompt injection, and membership inference, often targeting specific model architectures such as transformer-based LLMs and diffusion models. This research is essential for understanding and mitigating the risks posed by increasingly capable AI systems and for informing the development of more robust, trustworthy AI applications across diverse sectors.
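To make one of the attack families above concrete, the sketch below shows the core decision rule of a simple loss-threshold membership inference attack on a language model. It is a minimal illustration under stated assumptions: the `infer_membership` helper, the `target_loss` callable, and the toy loss values are hypothetical and not drawn from any paper listed on this page.

```python
# Minimal sketch of a loss-threshold membership inference attack:
# examples whose per-example loss under the target model falls below a
# calibrated threshold are predicted to have been in the training set.
# `target_loss`, `threshold`, and the toy data below are hypothetical
# stand-ins, not taken from any specific paper in this collection.
from typing import Callable, Sequence


def infer_membership(
    examples: Sequence[str],
    target_loss: Callable[[str], float],  # loss of the target model on one example
    threshold: float,
) -> list[bool]:
    """Return True for examples predicted to be training-set members."""
    return [target_loss(x) < threshold for x in examples]


if __name__ == "__main__":
    # Toy stand-in for querying a real model's per-example loss.
    fake_losses = {"memorized sentence": 0.4, "unseen sentence": 2.7}
    predictions = infer_membership(
        examples=["memorized sentence", "unseen sentence"],
        target_loss=lambda x: fake_losses[x],
        threshold=1.0,
    )
    print(predictions)  # [True, False]
```

In practice the threshold is calibrated on held-out data or shadow models and the losses come from querying the target model; the hard-coded values here only illustrate the decision rule.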
Papers
Papers published between October 17, 2024 and January 4, 2025.