Deep Leakage

Deep leakage refers to the unintended exposure of sensitive information during machine learning model training and deployment. It can compromise data privacy and, when leaked material contaminates training or evaluation data, produce misleadingly strong or biased results. Current research focuses on identifying and mitigating leakage in several contexts: federated learning, where the gradients and model weights exchanged between clients and a server can be inverted to reconstruct private training data; code generation, where evaluation datasets may be contaminated by material the model already saw during training; and general machine learning pipelines, where confounders and biased data let spurious information leak into models. Understanding and addressing deep leakage is crucial for ensuring the reliability and trustworthiness of machine learning systems across diverse applications, from healthcare to cybersecurity.
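In the federated learning setting, the canonical example is the gradient inversion attack from "Deep Leakage from Gradients": an attacker who observes a client's shared gradients optimizes dummy inputs and labels until the gradients they induce match the observed ones, thereby recovering the private data. The sketch below illustrates the idea; the toy linear model, the simulated "private" example, and the optimizer settings are illustrative assumptions rather than the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy model standing in for the model trained in federated learning.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

# Simulate the victim: compute gradients on a private example.
# In federated learning these gradients would be sent to the server.
x_true = torch.rand(1, 1, 28, 28)
y_true = torch.tensor([3])
loss = F.cross_entropy(model(x_true), y_true)
true_grads = torch.autograd.grad(loss, model.parameters())

# Attacker: initialize a dummy input and a soft dummy label,
# then optimize them so their gradients match the observed ones.
x_dummy = torch.randn_like(x_true, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)
optimizer = torch.optim.LBFGS([x_dummy, y_dummy])

def closure():
    optimizer.zero_grad()
    pred = model(x_dummy)
    # Cross-entropy against the (softmaxed) dummy label.
    dummy_loss = torch.sum(
        -F.softmax(y_dummy, dim=-1) * F.log_softmax(pred, dim=-1)
    )
    # create_graph=True so the gradient-matching loss is differentiable
    # with respect to x_dummy and y_dummy (double backprop).
    dummy_grads = torch.autograd.grad(
        dummy_loss, model.parameters(), create_graph=True
    )
    grad_diff = sum(
        ((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads)
    )
    grad_diff.backward()
    return grad_diff

for _ in range(30):  # iteration count chosen for illustration
    optimizer.step(closure)

# x_dummy now approximates the private input x_true.
print("reconstruction MSE:", F.mse_loss(x_dummy, x_true).item())
```

Defenses studied in this line of work, such as gradient pruning, quantization, or adding noise before transmission, aim to break exactly this gradient-matching objective at modest cost to model accuracy.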

Papers