Information Leakage

Information leakage in machine learning models, particularly large language models (LLMs) and diffusion models, refers to the unintended exposure of sensitive training data through model outputs or gradients. Current research focuses on quantifying leakage risks across architectures such as retrieval-augmented generation, mixture-of-experts, and diffusion models, and on developing mitigation strategies such as knowledge sanitization and differential privacy. Understanding and addressing information leakage is crucial for ensuring the responsible development and deployment of AI systems, protecting user privacy, and maintaining public trust.
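
To make the differential-privacy mitigation concrete, below is a minimal sketch of the core mechanism behind DP-SGD-style training: clipping the gradient norm and adding calibrated Gaussian noise before the optimizer step, which limits how much any single training example can leak through gradients. It assumes a PyTorch model; the hyperparameters (`clip_norm`, `noise_multiplier`) are illustrative, and batch-level clipping is used as a simplification of the per-example clipping required for formal privacy guarantees.

```python
import torch
from torch import nn

def noisy_gradient_step(model: nn.Module,
                        loss: torch.Tensor,
                        optimizer: torch.optim.Optimizer,
                        clip_norm: float = 1.0,
                        noise_multiplier: float = 1.1) -> None:
    """Backpropagate, clip the gradient norm, add Gaussian noise, then step.

    Simplified sketch: full DP-SGD clips each example's gradient separately
    and calibrates the noise scale to the batch size and privacy budget.
    """
    optimizer.zero_grad()
    loss.backward()

    # Bound the gradient norm so no update can encode too much about the batch.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=clip_norm)

    # Add Gaussian noise scaled to the clipping bound to each gradient tensor.
    for p in model.parameters():
        if p.grad is not None:
            p.grad.add_(torch.randn_like(p.grad) * noise_multiplier * clip_norm)

    optimizer.step()

# Usage: one noisy step on a tiny linear model with random data.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = nn.functional.mse_loss(model(x), y)
noisy_gradient_step(model, loss, optimizer)
```

The trade-off in this kind of mitigation is utility versus privacy: tighter clipping and larger noise reduce how much the gradients reveal about individual training examples, at the cost of slower or noisier convergence.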

Papers