LLM Unlearning

LLM unlearning focuses on removing specific information from large language models (LLMs) after training, addressing privacy and safety concerns that stem from memorization of sensitive data. Current research explores a range of methods: gradient-based approaches that penalize the model's likelihood on a designated "forget set," techniques leveraging "inverted facts" or prompt engineering, and methods employing second-order optimization or orthogonal adapters for efficient, targeted unlearning. The field is crucial for responsible LLM deployment, shaping both the ethical development of AI and the practical use of LLMs in sensitive contexts that require data protection.
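
To make the gradient-based family concrete, below is a minimal sketch of gradient-ascent unlearning on a forget set, assuming a Hugging Face causal LM. The model name, forget-set text, learning rate, and step count are illustrative placeholders, not the recipe of any particular paper.

```python
# Minimal sketch of gradient-ascent unlearning (assumptions: torch + transformers
# installed; "gpt2" and the forget set below are placeholders for illustration).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM follows the same pattern
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

forget_texts = ["Example sensitive passage to be unlearned."]  # hypothetical forget set
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

for step in range(3):  # a few ascent steps; real methods tune this carefully
    for text in forget_texts:
        batch = tokenizer(text, return_tensors="pt")
        outputs = model(**batch, labels=batch["input_ids"])
        # Negate the language-modeling loss: gradient *ascent* on the forget set
        # pushes the model away from reproducing the memorized content.
        loss = -outputs.loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

In practice, published methods pair this ascent term with a retention objective (e.g., standard training loss on unrelated data or a KL penalty against the original model) so that general capabilities are not destroyed while the targeted content is removed.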

Papers