Generative Data Augmentation
Generative data augmentation uses generative models to create synthetic training data, addressing the limitations of small or biased datasets in machine learning applications. Current research draws on diffusion models, variational autoencoders (VAEs), generative adversarial networks (GANs), and large language models (LLMs) to generate high-quality, diverse, and semantically meaningful synthetic data, often using techniques such as classifier-free guidance and controllable generation. The approach has been applied in medical image analysis, robotics, and natural language processing, where it aims to improve model performance, particularly in low-resource settings, and to yield more robust and generalizable models.
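The basic loop (fit a generator to the real data, sample from it, add the samples to the training set) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an independent Gaussian per class and feature stands in for the diffusion model, VAE, GAN, or LLM that actual work would use, and the function names and toy data are hypothetical.

```python
import random
import statistics
from collections import defaultdict


def fit_class_gaussians(samples):
    """Fit an independent Gaussian per feature for each class label.

    Stand-in "generative model" (an assumption for illustration only).
    """
    by_label = defaultdict(list)
    for features, label in samples:
        by_label[label].append(features)

    params = {}
    for label, rows in by_label.items():
        columns = list(zip(*rows))  # transpose rows into per-feature columns
        means = [statistics.fmean(col) for col in columns]
        # pstdev is defined even for a single row (it returns 0.0).
        stdevs = [statistics.pstdev(col) for col in columns]
        params[label] = (means, stdevs)
    return params


def sample_synthetic(params, n_per_class, rng):
    """Draw n_per_class labeled synthetic points from each fitted class."""
    synthetic = []
    for label, (means, stdevs) in params.items():
        for _ in range(n_per_class):
            features = tuple(rng.gauss(m, s) for m, s in zip(means, stdevs))
            synthetic.append((features, label))
    return synthetic


def augment(samples, n_per_class, seed=0):
    """Return the real samples followed by the synthetic ones."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    params = fit_class_gaussians(samples)
    return list(samples) + sample_synthetic(params, n_per_class, rng)


# Hypothetical toy dataset: (features, label) pairs.
real = [
    ((1.0, 2.0), "a"),
    ((1.2, 1.8), "a"),
    ((5.0, 5.5), "b"),
    ((4.8, 5.1), "b"),
]
augmented = augment(real, n_per_class=3)
print(len(real), "real ->", len(augmented), "total")
```

Keeping the real samples first and the synthetic ones after makes it easy to hold out real data for evaluation, so that any gain from augmentation is measured on data the generator never produced.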
Papers
Simple and Effective Synthesis of Indoor 3D Scenes
Jing Yu Koh, Harsh Agrawal, Dhruv Batra, Richard Tucker, Austin Waters, Honglak Lee, Yinfei Yang, Jason Baldridge, Peter Anderson
DAGAM: Data Augmentation with Generation And Modification
Byeong-Cheol Jo, Tak-Sung Heo, Yeongjoon Park, Yongmin Yoo, Won Ik Cho, Kyungsun Kim