Harmful Unlearning

Harmful unlearning, also known as machine unlearning, aims to remove specific data or knowledge from trained machine learning models, particularly large language models (LLMs), without complete retraining. Current research focuses on developing effective unlearning algorithms, often employing techniques like gradient-based methods, knowledge distillation, and adversarial training, across various model architectures including LLMs and diffusion models. This field is crucial for addressing privacy concerns, mitigating biases, and enhancing the safety and robustness of AI systems, impacting both data protection regulations and the trustworthiness of AI applications.

Papers

January 30, 2024

CaMU: Disentangling Causal Effects in Deep Model Unlearning
Shaofei Shen, Chenhao Zhang, Alina Bialkowski, Weitong Chen, Miao Xu
Causal Effect Machine Unlearning Unlearning Framework Harmful Unlearning Causal Influence

January 29, 2024

Blockchain-enabled Trustworthy Federated Unlearning
Yijing Lin, Zhipeng Gao, Hongyang Du, Jinke Ren, Zhiqiang Xie, Dusit Niyato
Unlearning Framework Harmful Unlearning Federated Unlearning

January 26, 2024

Unlearning Traces the Influential Training Data of Language Models
Masaru Isonuma, Ivan Titov
Language Model Training Data Harmful Unlearning Unlearned Model Training Datasets Training Data Influence

January 19, 2024

Communication Efficient and Provable Federated Unlearning
Youming Tao, Cheng-Long Wang, Miao Pan, Dongxiao Yu, Xiuzhen Cheng, Di Wang
Harmful Unlearning Communication Efficiency Unlearned Model Federated Unlearning Level Unlearning

January 17, 2024

Attack and Reset for Unlearning: Exploiting Adversarial Noise toward Machine Unlearning through Parameter Re-initialization
Yoonhwa Jung, Ikhyun Cho, Shun-Hsiang Hsu, Julia Hockenmaier
Deep Learning Model Machine Unlearning Unlearning Framework Harmful Unlearning Adversarial Noise Global Reset Feature Parameter Re Initialization

January 11, 2024

TOFU: A Task of Fictitious Unlearning for LLMs
Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, J. Zico Kolter
Large Language Model Medical LLM Large Corpus Related Task Machine Unlearning Harmful Unlearning

December 26, 2023

Reinforcement Unlearning
Dayong Ye, Tianqing Zhu, Congcong Zhu, Derui Wang, Kun Gao, Zewei Shi, Sheng Shen, Wanlei Zhou, Minhui Xue
Machine Unlearning Harmful Unlearning Knowledge Unlearning

December 22, 2023

On the Effectiveness of Unlearning in Session-Based Recommendation
Xin Xin, Liu Yang, Ziqi Zhao, Pengjie Ren, Zhumin Chen, Jun Ma, Zhaochun Ren
Unlearning Framework Harmful Unlearning Session Based Recommendation Session Based

December 12, 2023

FairSISA: Ensemble Post-Processing to Improve Fairness of Unlearning in LLMs
Swanand Ravindra Kadhe, Anisa Halimi, Ambrish Rawat, Nathalie Baracaldo
Large Language Model Procedural Fairness Bias Mitigation Unlearning Framework Harmful Unlearning Fairness Improvement Fair Predictor Ensemble Post Processing

December 7, 2023

Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection
Tuan Hoang, Santu Rana, Sunil Gupta, Svetha Venkatesh
Deep Neural Network Stochastic Gradient Descent Machine Unlearning Unlearning Framework Harmful Unlearning Knowledge Unlearning Gradient Projection

December 4, 2023

DUCK: Distance-based Unlearning via Centroid Kinematics
Marco Cotogni, Jacopo Bonato, Luigi Sabetta, Francesco Pelosin, Alessandro Nicolosi
Machine Unlearning Unlearning Framework Harmful Unlearning Centroidal Dynamic

November 26, 2023

Unlearning via Sparse Representations
Vedant Shah, Frederik Träuble, Ashish Malik, Hugo Larochelle, Michael Mozer, Sanjeev Arora, Yoshua Bengio, Anirudh Goyal
Knowledge Distillation Machine Unlearning Harmful Unlearning Sparse Representation Representation Bottleneck Zero Shot Unlearning

October 31, 2023

Unlearn What You Want to Forget: Efficient Unlearning for LLMs
Jiaao Chen, Diyi Yang
Large Language Model Unlearning Framework Harmful Unlearning Generation Task Data Removal

October 14, 2023

Large Language Model Unlearning
Yuanshun Yao, Xiaojun Xu, Yang Liu
Large Language Model Harmful Unlearning Negative Sample Alignment Performance LLM Unlearning

October 9, 2023

Unlearning with Fisher Masking
Yufang Liu, Changzhi Sun, Yuanbin Wu, Aimin Zhou
Training Data Harmful Unlearning Masking Strategy Prototype Mask

September 19, 2023

FRAMU: Attention-based Machine Unlearning using Federated Reinforcement Learning
Thanveer Shaik, Xiaohui Tao, Lin Li, Haoran Xie, Taotao Cai, Xiaofeng Zhu, Qing Li
Machine Unlearning Harmful Unlearning Federated Reinforcement Learning Federated Learning Application

July 19, 2023

What can we learn from Data Leakage and Unlearning for Law?
Jaydeep Borkar
Training Data Domain Specific Pre Training Legal Text Harmful Unlearning Data Leakage

June 27, 2023

Ticketed Learning-Unlearning Schemes
Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Ayush Sekhari, Chiyuan Zhang
Unlearning Framework Harmful Unlearning Learning Algorithm Concept Class

May 21, 2023

Random Relabeling for Efficient Machine Unlearning
Junde Li, Swaroop Ghosh
Machine Unlearning Unlearning Framework Harmful Unlearning Random Label Data Removal

May 10, 2023

Exploring the Landscape of Machine Unlearning: A Comprehensive Survey and Taxonomy
Thanveer Shaik, Xiaohui Tao, Haoran Xie, Lin Li, Xiaofeng Zhu, Qing Li
Machine Learning Training Data Machine Learning Model Comprehensive Survey Machine Unlearning Comprehensive Taxonomy Harmful Unlearning Landscape Image Training Model