Synthetic Electronic Health Record

Synthetic electronic health records (EHRs) are artificially generated patient records that address privacy concerns while providing realistic data for research and the development of healthcare applications. Current research focuses on generating high-fidelity data with generative adversarial networks (GANs), diffusion models, and transformer-based language models, with a strong emphasis on fairness and on mitigating bias in downstream predictive tasks. This work matters because it lets researchers analyze large-scale, realistic EHR data without exposing real patient records, which facilitates advances in machine learning for healthcare and supports the development of clinical decision support systems.

Papers