Data Auditing

Data auditing in machine learning verifies that the data used to train a model was used ethically and responsibly, with the aim of detecting unauthorized or biased data usage. Current research emphasizes methods for identifying data provenance, particularly in federated learning and across model types such as image classifiers and language models. These methods often employ membership inference attacks, which test whether a given record was part of a model's training set, along with techniques from information theory and causal inference. Such auditing is important for model fairness, accountability, and transparency, and it bears on both the development of responsible AI and the legal compliance of data-driven applications.
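To make the membership inference idea concrete, the following is a minimal sketch of one simple variant, a loss-threshold test: records on which the model's loss is unusually low are flagged as likely training members. It is an illustration under stated assumptions, not the method of any particular paper in this area. The function names, the confidence values, and the choice of the mean non-member loss as the threshold are all hypothetical.

```python
import math


def cross_entropy(prob_true_label):
    """Per-record loss, given the model's probability for the true label."""
    return -math.log(max(prob_true_label, 1e-12))  # clamp to avoid log(0)


def calibrate_threshold(nonmember_losses):
    """Assumed calibration rule: mean loss on records known to be unseen."""
    return sum(nonmember_losses) / len(nonmember_losses)


def audit_membership(candidate_losses, threshold):
    """Flag a record as a likely training member if its loss is below the threshold."""
    return [loss < threshold for loss in candidate_losses]


# Hypothetical model confidences on the true label (illustrative values only).
nonmember_conf = [0.55, 0.60, 0.40, 0.70]  # records known NOT to be in training
candidate_conf = [0.99, 0.97, 0.50, 0.95]  # records whose usage is being audited

threshold = calibrate_threshold([cross_entropy(p) for p in nonmember_conf])
flags = audit_membership([cross_entropy(p) for p in candidate_conf], threshold)

for conf, flagged in zip(candidate_conf, flags):
    print(f"confidence={conf:.2f} likely_member={flagged}")
```

In practice an auditor would calibrate the threshold on a much larger held-out set and report error rates alongside any flags, since low loss alone is only weak evidence that a record was used in training.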

Papers