Label-Only Model Inversion Attack

Label-only model inversion attacks reconstruct training data from a machine learning model using nothing but its predicted hard labels—the weakest output a classifier can expose. Because no confidence scores or gradients are available, current research focuses on attack strategies that compensate for this scarcity, for example by steering generative models such as conditional diffusion models, or by transferring knowledge from surrogate models. This area is crucial for understanding and mitigating privacy risks in deployed machine learning systems, as it demonstrates that even minimal model outputs can reveal sensitive training data.
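To make the threat model concrete, the sketch below illustrates the core idea on a toy problem: the attacker can only call a hard-label oracle, so it estimates a confidence proxy from labels alone (the fraction of noise-perturbed queries that keep the target label, a common label-only trick) and climbs that score with random search to synthesize an input representative of the target class. This is a minimal illustration, not any specific published attack; the victim model, data, and all function names (`predict_label`, `robustness_score`, `invert`) are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "private" training data: two Gaussian classes in 2-D.
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Train a simple logistic-regression victim model (stand-in for the target).
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    g = p - y
    w -= 0.1 * (X.T @ g) / len(y)
    b -= 0.1 * g.mean()

def predict_label(x):
    """Black-box oracle: the attacker sees only the hard label."""
    return int((x @ w + b) > 0)

def robustness_score(x, target, n=200, sigma=0.5):
    """Label-only confidence proxy: fraction of noisy copies of x
    that the oracle still assigns the target label."""
    noise = rng.normal(0, sigma, (n, x.size))
    return np.mean([predict_label(x + d) == target for d in noise])

def invert(target, iters=300, step=0.3):
    """Random-search inversion: hill-climb the robustness proxy."""
    x = rng.normal(0, 1, 2)
    best = robustness_score(x, target)
    for _ in range(iters):
        cand = x + rng.normal(0, step, 2)
        s = robustness_score(cand, target)
        if s >= best:
            x, best = cand, s
    return x

# Reconstruct a representative of class 1 from hard labels alone.
recon = invert(1)
print(recon)
```

Real attacks replace the 2-D random search with optimization over the latent space of a generative prior (e.g., a conditional diffusion model), but the information bottleneck is the same: every gradient signal must be distilled from hard labels.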

Papers