Tabular Data Augmentation
Tabular data augmentation aims to improve the performance of machine learning models trained on limited or low-quality tabular datasets by artificially increasing the amount of training data. Current research focuses on generative models, including Generative Adversarial Networks (GANs) and Energy-Based Models (EBMs), often enhanced by Large Language Models (LLMs) to improve data quality and realism, as well as on retrieval-based methods that leverage external data sources. The field matters because many real-world applications, particularly in domains where data collection is limited (e.g., medicine), depend on accurate machine learning models, and augmentation offers a way to improve their performance and reliability when more real data is hard to obtain.
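
To make "artificially increasing the amount of training data" concrete, here is a minimal sketch of one very simple augmentation baseline: creating synthetic rows by linear interpolation between two existing rows of the same class. This is not one of the generative (GAN, EBM, LLM) or retrieval-based methods named above; it is a swapped-in illustration only. The function name `augment_by_interpolation`, its parameters, and the toy data are hypothetical, and the sketch assumes all features are numeric.

```python
import random


def augment_by_interpolation(rows, labels, n_new, seed=0):
    """Append n_new synthetic rows to a numeric table.

    Each synthetic row is a convex combination of two randomly chosen
    rows that share the same label (hypothetical baseline, numeric
    features only).
    """
    rng = random.Random(seed)

    # Group row indices by class label so partners share a label.
    by_label = {}
    for idx, label in enumerate(labels):
        by_label.setdefault(label, []).append(idx)

    new_rows, new_labels = [], []
    for _ in range(n_new):
        i = rng.randrange(len(rows))          # anchor row
        j = rng.choice(by_label[labels[i]])   # same-class partner (may equal i)
        lam = rng.random()                    # mixing weight in [0, 1)
        new_rows.append(
            [(1 - lam) * a + lam * b for a, b in zip(rows[i], rows[j])]
        )
        new_labels.append(labels[i])

    # Original rows come first, synthetic rows are appended after them.
    return list(rows) + new_rows, list(labels) + new_labels


# Toy table: two numeric features, two classes (made-up values).
rows = [[1.0, 10.0], [2.0, 12.0], [8.0, 30.0], [9.0, 33.0]]
labels = [0, 0, 1, 1]
aug_rows, aug_labels = augment_by_interpolation(rows, labels, n_new=6)
```

Because each synthetic row lies on the segment between two real rows of the same class, it stays within that class's observed feature ranges. That also marks the limit of the approach: it cannot produce categorical values or capture complex dependencies between columns, which is part of the motivation for the learned generative methods described above.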