Paper ID: 2407.11785
Defining 'Good': Evaluation Framework for Synthetic Smart Meter Data
Sheng Chai, Gus Chadney, Charlot Avery, Phil Grunewald, Pascal Van Hentenryck, Priya L. Donti
Access to granular demand data is essential for the net zero transition; it allows for accurate profiling and active demand management as our reliance on variable renewable generation increases. However, public release of this data is often impossible due to privacy concerns. Good quality synthetic data can circumnavigate this issue. Despite significant research on generating synthetic smart meter data, there is still insufficient work on creating a consistent evaluation framework. In this paper, we investigate how common frameworks used by other industries leveraging synthetic data, can be applied to synthetic smart meter data, such as fidelity, utility and privacy. We also recommend specific metrics to ensure that defining aspects of smart meter data are preserved and test the extent to which privacy can be protected using differential privacy. We show that standard privacy attack methods like reconstruction or membership inference attacks are inadequate for assessing privacy risks of smart meter datasets. We propose an improved method by injecting training data with implausible outliers, then launching privacy attacks directly on these outliers. The choice of $\epsilon$ (a metric of privacy loss) significantly impacts privacy risk, highlighting the necessity of performing these explicit privacy tests when making trade-offs between fidelity and privacy.
Submitted: Jul 16, 2024