Synthetic Census

Synthetic census data generation aims to create realistic, privacy-preserving substitutes for real census microdata, easing restrictions on data access while addressing privacy concerns. Current research develops generative machine learning models, particularly Generative Adversarial Networks (GANs), alongside other approaches such as microsimulation, to produce high-fidelity synthetic datasets that reflect the statistical properties of the original data while minimizing disclosure risk. This work matters because it gives researchers broader access to valuable population data for applications such as fraud detection and real estate price prediction, while protecting individual privacy. Developing robust evaluation frameworks that compare the utility and disclosure risk of different generation methods is also a key area of ongoing investigation.
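To make the utility-versus-risk comparison concrete, here is a minimal sketch in Python. It is not taken from any particular paper: the function names `marginal_tvd` and `exact_match_rate`, the toy records, and the choice of metrics are all illustrative assumptions. Utility is approximated by the total variation distance between real and synthetic marginals (lower is better), and disclosure risk by the share of synthetic records that exactly replicate a real record, which is only a crude proxy because matches in categorical data can also occur by chance.

```python
from collections import Counter


def marginal_tvd(real, synth, column):
    """Total variation distance between the real and synthetic
    marginal distributions of one categorical column (0 = identical)."""
    real_counts = Counter(row[column] for row in real)
    synth_counts = Counter(row[column] for row in synth)
    categories = set(real_counts) | set(synth_counts)
    return 0.5 * sum(
        abs(real_counts[c] / len(real) - synth_counts[c] / len(synth))
        for c in categories
    )


def exact_match_rate(real, synth):
    """Share of synthetic records identical to some real record.
    A crude, assumed stand-in for a disclosure-risk measure."""
    real_records = {tuple(sorted(row.items())) for row in real}
    hits = sum(tuple(sorted(row.items())) in real_records for row in synth)
    return hits / len(synth)


# Hypothetical toy microdata; real census files have far more columns and rows.
real = [
    {"age_band": "18-34", "tenure": "rent"},
    {"age_band": "18-34", "tenure": "own"},
    {"age_band": "35-64", "tenure": "own"},
    {"age_band": "65+", "tenure": "own"},
]
synth = [
    {"age_band": "18-34", "tenure": "rent"},
    {"age_band": "35-64", "tenure": "rent"},
    {"age_band": "35-64", "tenure": "own"},
    {"age_band": "65+", "tenure": "own"},
]

utility_gap = {col: marginal_tvd(real, synth, col) for col in ("age_band", "tenure")}
risk = exact_match_rate(real, synth)
print(utility_gap, risk)
```

Reporting both numbers side by side for each generation method exposes the trade-off: a generator that simply copies the real records would score a perfect utility gap of zero but an exact-match rate of one.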

Papers