Augmented Data
Augmented data techniques aim to improve the performance and robustness of machine learning models by artificially expanding training datasets. Current research focuses on more sophisticated augmentation strategies, including generative models, contrastive learning, and adaptive augmentation methods, often combined with deep learning architectures such as CNNs, transformers, and recurrent neural networks. These advances address data scarcity, class imbalance, and the need for models that generalize across diverse conditions, with applications ranging from natural language processing and computer vision to robotics and autonomous driving. The ultimate goal is to build more accurate, reliable, and robust models with less reliance on extensive, expensive data collection.
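As a concrete illustration of "artificially expanding" a training set, here is a minimal sketch of plain mixup, which blends pairs of examples and their labels into synthetic ones. It is a generic illustration only, not the method of any paper listed on this page. The function names `mixup` and `augment`, the `alpha` default, and the toy dataset layout (feature list plus one-hot label list) are assumptions made for this example.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two (features, one-hot label) pairs into one synthetic example.

    Hypothetical helper: the mixing weight lam is drawn from Beta(alpha, alpha).
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


def augment(dataset, n_new, alpha=0.2, seed=0):
    """Return the original examples plus n_new mixup-generated ones."""
    rng = random.Random(seed)
    out = list(dataset)
    for _ in range(n_new):
        # Pick two distinct examples and blend them.
        (x1, y1), (x2, y2) = rng.sample(dataset, 2)
        x, y, _ = mixup(x1, y1, x2, y2, alpha, rng)
        out.append((x, y))
    return out


# Toy dataset: 2-d features with one-hot labels for two classes.
toy = [([0.0, 1.0], [1.0, 0.0]),
       ([1.0, 0.0], [0.0, 1.0]),
       ([0.5, 0.5], [1.0, 0.0])]
expanded = augment(toy, n_new=5)
```

Each synthetic label is a convex combination of two one-hot labels, so it still sums to 1 and can be used as a soft target. Small `alpha` values tend to keep the blended examples close to one of the two originals.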
Papers
DP-Mix: Mixup-based Data Augmentation for Differentially Private Learning
Wenxuan Bao, Francesco Pittaluga, Vijay Kumar B G, Vincent Bindschaedler
People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection
Indira Sen, Dennis Assenmacher, Mattia Samory, Isabelle Augenstein, Wil van der Aalst, Claudia Wagner