Paper ID: 2302.09193
Copula-based transferable models for synthetic population generation
Pascal Jutras-Dubé, Mohammad B. Al-Khasawneh, Zhichao Yang, Javier Bas, Fabian Bastin, Cinzia Cirillo
Population synthesis involves generating synthetic yet realistic representations of a target population of micro-agents for behavioral modeling and simulation. Traditional methods, often reliant on target population samples, such as census data or travel surveys, face limitations due to high costs and small sample sizes, particularly at smaller geographical scales. We propose a novel framework based on copulas to generate synthetic data for target populations where only empirical marginal distributions are known. This method utilizes samples from different populations with similar marginal dependencies, introduces a spatial component into population synthesis, and considers various information sources for more realistic generators. Concretely, the process involves normalizing the data and treat it as realizations of a given copula, and then training a generative model before incorporating the information on the marginals of the target population. Utilizing American Community Survey data, we assess our framework's performance through standardized root mean squared error (SRMSE) and so-called sampled zeros. We focus on its capacity to transfer a model learned from one population to another. Our experiments include transfer tests between regions at the same geographical level as well as to lower geographical levels, hence evaluating the framework's adaptability in varied spatial contexts. We compare Bayesian Networks, Variational Autoencoders, and Generative Adversarial Networks, both individually and combined with our copula framework. Results show that the copula enhances machine learning methods in matching the marginals of the reference data. Furthermore, it consistently surpasses Iterative Proportional Fitting in terms of SRMSE in the transferability experiments, while introducing unique observations not found in the original training sample.
Submitted: Feb 17, 2023