Harmful Unlearning
Harmful unlearning, a form of machine unlearning, aims to remove specific harmful data or knowledge from trained machine learning models, particularly large language models (LLMs), without retraining them from scratch. Current research focuses on developing effective unlearning algorithms, often employing gradient-based methods, knowledge distillation, and adversarial training, across model architectures including LLMs and diffusion models. This work is crucial for addressing privacy concerns, mitigating biases, and improving the safety and robustness of AI systems, with implications for both data-protection regulation and the trustworthiness of deployed AI applications.
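The gradient-based approach mentioned above can be illustrated with a minimal, hypothetical sketch: ascend the loss on a "forget" set (to erase its influence) while descending on a "retain" set (to preserve remaining capability). This toy logistic-regression example is purely illustrative; all function names and data splits are assumptions, not drawn from any specific paper.

```python
import numpy as np

# Toy sketch of gradient-ascent unlearning on a logistic-regression "model".
# Real LLM unlearning operates on network parameters the same way in spirit:
# maximize loss on the forget set, minimize it on the retain set.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y):
    # Mean binary cross-entropy.
    p = sigmoid(X @ w)
    return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def grad(w, X, y):
    # Gradient of the mean binary cross-entropy w.r.t. the weights w.
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def unlearn(w, X_forget, y_forget, X_retain, y_retain, lr=0.1, steps=200):
    w = w.copy()
    for _ in range(steps):
        w += lr * grad(w, X_forget, y_forget)  # gradient ascent: forget
        w -= lr * grad(w, X_retain, y_retain)  # gradient descent: retain
    return w
```

After unlearning, the loss on the forget set rises while performance on the retain set is largely preserved, which is the basic trade-off these algorithms try to balance.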