Tabular Data Synthesis

Tabular data synthesis aims to generate synthetic datasets that preserve the statistical properties of real data while protecting the privacy of the underlying records. Current research focuses on improving the quality and utility of synthetic data with generative models, including Generative Adversarial Networks (GANs), diffusion models, and, increasingly, large language models (LLMs), often combined with techniques such as conditional generation and differential privacy. The field enables data sharing and analysis in sensitive domains such as healthcare, finance, and scientific research while mitigating privacy risks. A key challenge remains balancing the fidelity of synthetic data against the privacy protection it offers.
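The tension between fidelity and privacy can be made concrete with a small sketch. The code below is illustrative only and is not taken from any of the papers listed here: the toy data, the noise-based "synthesizer", and the helper names `mean_gap` and `distance_to_closest_record` are all assumptions. It scores two candidate synthetic tables with a crude fidelity measure (the largest gap between per-column means) and a simple privacy proxy (the distance from each synthetic row to its closest real row).

```python
import math
import random


def distance_to_closest_record(synthetic, real):
    """For each synthetic row, Euclidean distance to its nearest real row."""
    return [min(math.dist(s, r) for r in real) for s in synthetic]


def mean_gap(synthetic, real):
    """Largest absolute difference between per-column means (crude fidelity score)."""
    gaps = []
    for j in range(len(real[0])):
        real_mean = sum(row[j] for row in real) / len(real)
        synth_mean = sum(row[j] for row in synthetic) / len(synthetic)
        gaps.append(abs(real_mean - synth_mean))
    return max(gaps)


rng = random.Random(0)

# Toy "real" table with two numeric columns (hypothetical data).
real = [(rng.gauss(50, 10), rng.gauss(3, 1)) for _ in range(200)]

# Candidate A: verbatim copies of the real rows (perfect fidelity, no privacy).
copied = list(real)

# Candidate B: real rows plus Gaussian noise, a stand-in for a generative model.
noisy = [(a + rng.gauss(0, 5), b + rng.gauss(0, 0.5)) for a, b in real]

for name, synth in [("copied", copied), ("noisy", noisy)]:
    dcr = distance_to_closest_record(synth, real)
    print(name, "mean gap:", round(mean_gap(synth, real), 3),
          "min DCR:", round(min(dcr), 3))
```

Copying the real rows gives a mean gap of zero but also a closest-record distance of zero, meaning every synthetic row exposes a real one. Adding noise moves rows away from the real records at some cost in fidelity. Real evaluations use richer metrics on both sides, so these two numbers should be read as a minimal illustration of the trade-off rather than a benchmark.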

Papers

October 8, 2022

STaSy: Score-based Tabular data Synthesis
Jayoung Kim, Chaejeong Lee, Noseong Park
Generative Model Score Based Generative Tabular Data Synthesis

August 17, 2022

An Empirical Study on the Membership Inference Attack against Tabular Data Synthesis Models
Jihyeon Hyeong, Jayoung Kim, Noseong Park, Sushil Jajodia
Differential Privacy Empirical Study Membership Inference Attack Tabular Data Synthesis

May 24, 2022

RCC-GAN: Regularized Compound Conditional GAN for Large-Scale Tabular Data Synthesis
Mohammad Esmaeilpour, Nourhene Chaalia, Adel Abusitta, Francois-Xavier Devailly, Wissem Maazoun, Patrick Cardinal
Generative Model Generative Adversarial Network Adversarial Training Conditional Generative Tabular Data Synthesis

April 1, 2022

CTAB-GAN+: Enhancing Tabular Data Synthesis
Zilong Zhao, Aditya Kunar, Robert Birke, Lydia Y. Chen
Generative Adversarial Network GAN Model C Gan Tabular Data Synthesis Tabular Generative Adversarial Network GAN Baseline

February 8, 2022

Invertible Tabular GANs: Killing Two Birds with One Stone for Tabular Data Synthesis
Jaehoon Lee, Jihyeon Hyeong, Jinsung Jeon, Noseong Park, Jihoon Cho
Adversarial Training GAN Model Bird Specie Tabular Data Synthesis GAN Framework
