Erasure Method
Concept erasure methods aim to remove specific information from trained machine learning models, particularly large language models and text-to-image diffusion models, in order to address safety, privacy, and copyright concerns. Current research focuses on making erasure more effective and more efficient, often through targeted parameter updates, embedding manipulation, or changes to attention mechanisms in architectures such as Stable Diffusion and GPT-J. This work matters for reducing the risk of unsafe content generation and for supporting responsible AI development, with applications that include content moderation and data privacy.
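To make "embedding manipulation" concrete, here is a minimal sketch of one simple form of it: estimating a single concept direction and projecting it out of an embedding. This is an illustrative assumption, not the method of any particular paper listed here. The function names (`concept_direction`, `erase`) and the difference-of-means estimate of the direction are hypothetical choices made for the example, and real systems operate on model activations or text-encoder embeddings rather than toy 2-D lists.

```python
import math
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def concept_direction(
    with_concept: Sequence[Sequence[float]],
    without_concept: Sequence[Sequence[float]],
) -> Vector:
    """Assumed estimator: unit vector along the difference of the two group means."""
    mu_pos = mean_vector(with_concept)
    mu_neg = mean_vector(without_concept)
    diff = [p - q for p, q in zip(mu_pos, mu_neg)]
    norm = math.sqrt(dot(diff, diff))
    if norm == 0.0:
        raise ValueError("group means coincide; no direction to erase")
    return [d / norm for d in diff]


def erase(vector: Sequence[float], direction: Sequence[float]) -> Vector:
    """Remove the component of `vector` along the unit vector `direction`."""
    coeff = dot(vector, direction)
    return [v - coeff * d for v, d in zip(vector, direction)]


# Toy embeddings: the first group is assumed to encode the concept.
with_concept = [[2.0, 1.0], [4.0, 1.0]]
without_concept = [[0.0, 1.0], [0.0, 3.0]]

direction = concept_direction(with_concept, without_concept)
cleaned = erase([5.0, 2.0], direction)
```

After projection, `cleaned` has no component along `direction`, so a linear probe aligned with that direction can no longer read the concept from it. A single linear projection is the weakest form of erasure: it does not address information spread across several directions or encoded nonlinearly, which is part of why the research summarized above also explores parameter updates and attention-level changes.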