Data Augmentation
Data augmentation artificially expands a dataset by creating modified versions of existing examples, primarily to improve the performance and robustness of machine learning models when training data is scarce. Current research focuses on more sophisticated augmentation methods, including those that use generative models such as GANs and diffusion models, and on combining augmentation with techniques such as contrastive learning and transfer learning, often within transformer and convolutional neural network architectures. This work is significant because it mitigates the constraints of small datasets across domains ranging from image classification and object detection to natural language processing and time series forecasting, which can yield more accurate and generalizable models.
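As a rough illustration of "creating modified versions of existing data", here is a minimal sketch in plain Python. It is not taken from any of the papers listed here; the helper names (horizontal_flip, add_noise, augment), the toy dataset, and the choice of flips and Gaussian noise are all assumptions made for the example, and it assumes these transformations do not change an example's label.

```python
import random

def horizontal_flip(image):
    """Mirror a 2D image (a list of rows) left to right."""
    return [row[::-1] for row in image]

def add_noise(image, scale, rng):
    """Add zero-mean Gaussian noise with standard deviation `scale` to every pixel."""
    return [[pixel + rng.gauss(0.0, scale) for pixel in row] for row in image]

def augment(dataset, copies=2, noise_scale=0.05, seed=0):
    """Return the original (image, label) pairs plus `copies` modified versions of each.

    Assumption: flipping and small noise are label-preserving for this data.
    """
    rng = random.Random(seed)
    augmented = list(dataset)  # keep the originals first
    for image, label in dataset:
        for _ in range(copies):
            # Flip with probability 0.5, then always perturb with noise.
            new_image = horizontal_flip(image) if rng.random() < 0.5 else image
            new_image = add_noise(new_image, noise_scale, rng)
            augmented.append((new_image, label))
    return augmented

# Hypothetical toy dataset: two 2x2 "images" with class labels.
dataset = [([[0.0, 1.0], [0.5, 0.25]], "a"), ([[1.0, 0.0], [0.0, 1.0]], "b")]
expanded = augment(dataset, copies=2)
print(len(dataset), "->", len(expanded))  # prints "2 -> 6"
```

With copies=2, each of the two original examples gains two modified variants, so the dataset grows from 2 to 6 labeled examples. Generative approaches such as GANs or diffusion models replace these hand-written transformations with learned ones, but the overall pattern of adding label-consistent variants is the same.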
Papers
Controllable and Efficient Multi-Class Pathology Nuclei Data Augmentation using Text-Conditioned Diffusion Models
Hyun-Jic Oh, Won-Ki Jeong
Shape and Style GAN-based Multispectral Data Augmentation for Crop/Weed Segmentation in Precision Farming
Mulham Fawakherji, Vincenzo Suriani, Daniele Nardi, Domenico Daniele Bloisi
Improving Accented Speech Recognition using Data Augmentation based on Unsupervised Text-to-Speech Synthesis
Cong-Thanh Do, Shuhei Imai, Rama Doddipatla, Thomas Hain
LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking
Amy Xin, Yunjia Qi, Zijun Yao, Fangwei Zhu, Kaisheng Zeng, Xu Bin, Lei Hou, Juanzi Li