Data Augmentation
Data augmentation artificially expands a dataset by creating modified versions of existing examples, primarily to improve the performance and robustness of machine learning models when training data is scarce. Current research focuses on more sophisticated augmentation methods, including those that leverage generative models such as GANs and diffusion models, and on combining augmentation with techniques such as contrastive learning and transfer learning, often within transformer and convolutional neural network architectures. The work matters because small datasets constrain models across many domains, from image classification and object detection to natural language processing and time series forecasting, and effective augmentation can yield more accurate and generalizable models.
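As a rough illustration of "creating modified versions of existing data", the sketch below expands a tiny labeled image set with two classic transforms, a horizontal flip and additive Gaussian noise. It is a minimal example and not drawn from any of the papers listed here: the function names, the toy grayscale images (nested lists with values in [0, 1]), the labels, and the noise level `sigma` are all hypothetical, and it assumes both transforms leave the class label unchanged, which does not hold for every task.

```python
import random


def horizontal_flip(image):
    """Return a mirrored copy of the image (each row reversed)."""
    return [list(reversed(row)) for row in image]


def add_gaussian_noise(image, sigma, rng):
    """Return a copy with zero-mean Gaussian noise added, clipped to [0, 1]."""
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


def augment_dataset(samples, sigma=0.05, seed=0):
    """Expand (image, label) pairs; each original gains two variants.

    Assumption: flipping and mild noise do not change the label.
    """
    rng = random.Random(seed)  # seeded so the noise is reproducible
    augmented = []
    for image, label in samples:
        augmented.append((image, label))                   # original
        augmented.append((horizontal_flip(image), label))  # mirrored copy
        augmented.append((add_gaussian_noise(image, sigma, rng), label))
    return augmented


# Hypothetical toy data: two 2x3 grayscale images with made-up labels.
samples = [
    ([[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]], "cat"),
    ([[1.0, 1.0, 0.0], [0.0, 0.3, 0.9]], "dog"),
]
expanded = augment_dataset(samples)
print(len(samples), "->", len(expanded))  # prints "2 -> 6"
```

Hand-written transforms like these are the simplest case. The generative approaches mentioned above replace them with learned models that synthesize new examples, but the goal is the same: more labeled training examples from the same underlying data.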
Papers
Improving Post-Earthquake Crack Detection using Semi-Synthetic Generated Images
Piercarlo Dondi, Alessio Gullotti, Michele Inchingolo, Ilaria Senaldi, Chiara Casarotti, Luca Lombardi, Marco Piastra
Building a Family of Data Augmentation Models for Low-cost LLM Fine-tuning on the Cloud
Yuanhao Yue, Chengyu Wang, Jun Huang, Peng Wang
Variable-Speed Teaching-Playback as Real-World Data Augmentation for Imitation Learning
Nozomu Masuya, Hiroshi Sato, Koki Yamane, Takuya Kusume, Sho Sakaino, Toshiaki Tsuji
Channel Reflection: Knowledge-Driven Data Augmentation for EEG-Based Brain-Computer Interfaces
Ziwei Wang, Siyang Li, Jingwei Luo, Jiajing Liu, Dongrui Wu
Curriculum-style Data Augmentation for LLM-based Metaphor Detection
Kaidi Jia, Yanxia Wu, Rongsheng Li
QA-TOOLBOX: Conversational Question-Answering for process task guidance in manufacturing
Ramesh Manuvinakurike, Elizabeth Watkins, Celal Savur, Anthony Rhodes, Sovan Biswas, Gesem Gudino Mejia, Richard Beckwith, Saurav Sahay, Giuseppe Raffa, Lama Nachman
GenMix: Effective Data Augmentation with Generative Diffusion Model Image Editing
Khawar Islam, Muhammad Zaigham Zaheer, Arif Mahmood, Karthik Nandakumar, Naveed Akhtar
Evaluating the Impact of Data Augmentation on Predictive Model Performance
Valdemar Švábenský, Conrad Borchers, Elizabeth B. Cloude, Atsushi Shimada
Improving the performance of weak supervision searches using data augmentation
Zong-En Chen, Cheng-Wei Chiang, Feng-Yang Hsieh
Data Augmentation through Background Removal for Apple Leaf Disease Classification Using the MobileNetV2 Model
Youcef Ferdi
Topology-Preserving Scaling in Data Augmentation
Vu-Anh Le, Mehmet Dik
Generalizable Person Re-identification via Balancing Alignment and Uniformity
Yoonki Cho, Jaeyoon Kim, Woo Jae Kim, Junsik Jung, Sung-eui Yoon
Can Open-source LLMs Enhance Data Synthesis for Toxic Detection?: An Experimental Study
Zheng Hui, Zhaoxiao Guo, Hang Zhao, Juanyong Duan, Lin Ai, Yinheng Li, Julia Hirschberg, Congrui Huang