Erasure Method
Concept erasure methods aim to remove specific information from machine learning models, particularly large language models and text-to-image diffusion models, to address safety, privacy, and copyright concerns. Current research focuses on making erasure more effective and more efficient, typically through targeted parameter updates, embedding manipulation, or interventions on attention mechanisms in models such as Stable Diffusion and GPT-J. These techniques help reduce the risk of unsafe content generation and have applications in content moderation and data privacy.
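As a rough illustration of what "embedding manipulation" can mean, the sketch below removes a single linear concept direction from a set of toy embeddings. It is a simplified mean-difference projection written for this page, not the method of any particular paper or model; the function names (`concept_direction`, `erase`) and the three-dimensional toy vectors are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean(vectors: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def concept_direction(with_concept: Sequence[Sequence[float]],
                      without_concept: Sequence[Sequence[float]]) -> Vector:
    """Unit vector along the difference of the two class means.

    Assumption: the concept is (approximately) encoded along one linear
    direction, which is a strong simplification.
    """
    diff = [p - q for p, q in zip(mean(with_concept), mean(without_concept))]
    norm = dot(diff, diff) ** 0.5
    if norm == 0.0:
        raise ValueError("class means coincide; no direction to erase")
    return [d / norm for d in diff]


def erase(embedding: Sequence[float], direction: Sequence[float]) -> Vector:
    """Project the embedding onto the subspace orthogonal to `direction`."""
    scale = dot(embedding, direction)
    return [e - scale * d for e, d in zip(embedding, direction)]


# Toy data (hypothetical): the first coordinate carries the concept.
with_concept = [[2.0, 1.0, 0.0], [3.0, 1.0, 1.0]]
without_concept = [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]

direction = concept_direction(with_concept, without_concept)
erased = [erase(v, direction) for v in with_concept + without_concept]
```

After erasure, every vector has zero component along the estimated direction, so a linear probe along that direction can no longer separate the two groups. Methods for real models generally need more than this, since a concept is rarely confined to one direction.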