Data Extraction Attack

Data extraction attacks exploit the tendency of large language models (LLMs), including transformer-based architectures such as BERT and GPT, to memorize portions of their training data, enabling adversaries to recover sensitive information verbatim. Current research focuses both on stronger attack techniques, such as "neural phishing" and methods that leverage model unlearning or pseudo-labeling to amplify leakage, and on defenses such as differentially private prediction. The area matters because LLMs are widely trained on potentially sensitive datasets, raising data-privacy concerns and legal questions about the use of such models.
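
For concreteness, a common extraction recipe samples many continuations from the target model under short attacker-chosen prompts and then ranks the outputs by perplexity, treating unusually fluent text as likely verbatim memorization. The sketch below follows that general pattern and is only illustrative: the GPT-2 checkpoint, the example prompt, and the sampling parameters are placeholder assumptions, not details drawn from any specific paper listed here.

```python
# Minimal sketch of a memorization-based extraction attack: sample many
# continuations from a causal LM and rank them by perplexity, on the
# assumption that verbatim-memorized training text scores unusually low.
# GPT-2 is used purely as a stand-in for any Hugging Face causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM checkpoint could be targeted
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def sample_candidates(prompt: str, n: int = 20, max_new_tokens: int = 64):
    """Generate n candidate continuations for a short attacker-chosen prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            do_sample=True,
            top_k=40,
            max_new_tokens=max_new_tokens,
            num_return_sequences=n,
            pad_token_id=tokenizer.eos_token_id,
        )
    return [tokenizer.decode(seq, skip_special_tokens=True) for seq in outputs]

def perplexity(text: str) -> float:
    """Per-token perplexity under the target model; lower suggests memorization."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# Rank sampled texts so the most suspiciously fluent (lowest-perplexity)
# candidates surface first for manual or automated review.
candidates = sample_candidates("Contact me at ")
for text in sorted(candidates, key=perplexity)[:5]:
    print(f"{perplexity(text):8.2f}  {text!r}")
```

In practice, published attacks refine this baseline with better candidate filters (for example, comparing perplexity against a second reference model), but the core signal remains the same: memorized training text is scored as anomalously likely by the model that memorized it.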

Papers