Training Data Extraction

Training data extraction studies how fragments of the datasets used to train large language models (LLMs) can be identified and retrieved from the models themselves, exposing vulnerabilities in data privacy and security. Current research explores prompting strategies (e.g., special characters or carefully crafted instructions) that elicit memorized text, together with refined ranking methods that make extraction more accurate and efficient. This work is crucial for assessing the security risks of LLMs and for informing more robust privacy-preserving training and deployment methods.
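To make the ranking step concrete, here is a minimal sketch of one possible heuristic: score each candidate generation by the model's negative log-likelihood divided by the size of the text after zlib compression, and put the lowest ratios first. The idea is that text the model finds far easier to predict than a generic compressor does is a more plausible memorization candidate. The function names, the sample strings, and the log-likelihood values are all hypothetical; in practice the likelihoods would come from the model under study, and this is not necessarily the scoring used by any particular paper listed here.

```python
import zlib


def zlib_bits(text: str) -> int:
    """Size of the zlib-compressed text in bits, a rough proxy for its entropy."""
    return 8 * len(zlib.compress(text.encode("utf-8")))


def rank_candidates(candidates):
    """Rank (text, negative_log_likelihood) pairs, most suspicious first.

    Score = model NLL / zlib bits. A low score means the model assigns the
    text a much higher likelihood than its generic compressibility suggests.
    """
    scored = [(nll / zlib_bits(text), text) for text, nll in candidates]
    scored.sort(key=lambda pair: pair[0])  # ascending: lowest ratio first
    return scored


# Hypothetical candidates with made-up NLL values (assumption: in a real
# pipeline these would be computed by the target model).
candidates = [
    ("purple elephant quantum sandwich gazebo!!", 120.0),
    ("the the the the the the the the the the", 6.0),
    ("John Doe, 42 Example Road, phone 555-0100", 4.0),
]

ranked = rank_candidates(candidates)
for score, text in ranked:
    print(f"{score:.4f}  {text}")
```

Normalizing by compressed size is one way to keep trivially repetitive text, which any model predicts easily, from crowding out the top of the list. Other reference signals, such as the likelihood under a second model, could be substituted in the same structure.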

Papers