Generative AI Safety
Generative AI safety research focuses on mitigating the risks of harmful outputs from powerful models such as large language models (LLMs) and text-to-image diffusion models. Current efforts concentrate on techniques such as fine-grained content moderation (e.g., token-level redaction of unsafe spans, sketched below), probabilistic risk-assessment frameworks for copyright and other legal exposure, and methods for improving model robustness against adversarial attacks and "jailbreaking" attempts. The field is central to responsible AI development, affecting both the trustworthiness of deployed systems and the design of effective safety guidelines and regulations.
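The paragraph above mentions token-level redaction as one form of fine-grained content moderation. Below is a minimal, illustrative sketch of the idea, assuming a hypothetical per-token risk scorer (score_token) and a placeholder policy list; a production system would instead use a trained moderation model that scores tokens in context.

```python
# Minimal sketch of token-level redaction for fine-grained content moderation.
# Assumptions: `score_token` and `UNSAFE_TERMS` are illustrative placeholders,
# not part of any real moderation API.

from typing import List

UNSAFE_TERMS = {"credit_card_number", "home_address"}  # placeholder policy list


def score_token(token: str) -> float:
    """Toy risk score: 1.0 if the token matches a policy term, else 0.0."""
    return 1.0 if token.lower() in UNSAFE_TERMS else 0.0


def redact(tokens: List[str], threshold: float = 0.5, mask: str = "[REDACTED]") -> List[str]:
    """Replace only the tokens whose risk score exceeds the threshold,
    leaving the rest of the generation untouched."""
    return [mask if score_token(t) > threshold else t for t in tokens]


if __name__ == "__main__":
    generated = ["Please", "send", "it", "to", "my", "home_address", "."]
    print(" ".join(redact(generated)))
    # -> Please send it to my [REDACTED] .
```

The point of operating at the token level, rather than refusing or rewriting an entire response, is to preserve as much of the benign output as possible while still suppressing the specific unsafe spans.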