Synthetic Data
Synthetic data generation aims to create artificial datasets that mimic the statistical properties of real-world data, addressing limitations like data scarcity, privacy concerns, and high annotation costs. Current research focuses on developing sophisticated generative models, including generative adversarial networks (GANs), energy-based models (EBMs), diffusion models, and masked language models, tailored to various data types (images, text, tabular data, audio). This rapidly evolving field significantly impacts diverse scientific domains and practical applications by enabling the training of robust machine learning models in situations where real data is insufficient or ethically problematic, ultimately improving model performance and expanding research possibilities.
Papers
Learning Keypoints for Robotic Cloth Manipulation using Synthetic Data
Thomas Lips, Victor-Louis De Gusseme, Francis wyffels
Synthetic Data in AI: Challenges, Applications, and Ethical Implications
Shuang Hao, Wenfeng Han, Tao Jiang, Yiping Li, Haonan Wu, Chunlin Zhong, Zhangjun Zhou, He Tang
Adversarial Machine Learning-Enabled Anonymization of OpenWiFi Data
Samhita Kuili, Kareem Dabbour, Irtiza Hasan, Andrea Herscovich, Burak Kantarci, Marcel Chenier, Melike Erol-Kantarci
Reliability in Semantic Segmentation: Can We Use Synthetic Data?
Thibaut Loiseau, Tuan-Hung Vu, Mickael Chen, Patrick Pérez, Matthieu Cord
Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments
Liyuan Zhu, Shengyu Huang, Konrad Schindler, Iro Armeni
Dataset Distillation via Adversarial Prediction Matching
Mingyang Chen, Bo Huang, Junda Lu, Bing Li, Yi Wang, Minhao Cheng, Wei Wang
View-Dependent Octree-based Mesh Extraction in Unbounded Scenes for Procedural Synthetic Data
Zeyu Ma, Alexander Raistrick, Lahav Lipson, Jia Deng
Challenges of YOLO Series for Object Detection in Extremely Heavy Rain: CALRA Simulator based Synthetic Evaluation Dataset
T. Kim, H. Jeon, Y. Lim
The Real Deal Behind the Artificial Appeal: Inferential Utility of Tabular Synthetic Data
Alexander Decruyenaere, Heidelinde Dehaene, Paloma Rabaey, Christiaan Polet, Johan Decruyenaere, Stijn Vansteelandt, Thomas Demeester