Augmentation Pipeline
Data augmentation pipelines transform existing training samples, or synthesize new ones, to improve the performance of machine learning models, particularly when training data is limited or imbalanced. Current research focuses on optimizing these pipelines with automated search algorithms, synthesizing new data with generative models such as GANs and Stable Diffusion, and using techniques such as counterfactual learning to build more robust representations. These advances improve the accuracy and efficiency of models in medical image analysis, natural language processing, and other applications where data scarcity is a major challenge.
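As a rough illustration of what such a pipeline looks like in its simplest form, the sketch below chains two hand-written transforms, each applied with a fixed probability, and uses them to enlarge a small dataset. It is not taken from any of the listed papers: the names `AugmentationPipeline`, `horizontal_flip`, `add_noise`, and `augment_dataset` are hypothetical, images are assumed to be grayscale lists of rows with values in [0, 1], and the probabilities are the kind of hyperparameter an automated search would tune.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Assumption: a grayscale image is a list of rows of floats in [0, 1].
Image = List[List[float]]
Transform = Callable[[Image, random.Random], Image]


def horizontal_flip(img: Image, rng: random.Random) -> Image:
    """Mirror each row left to right (rng is unused but keeps one signature)."""
    return [row[::-1] for row in img]


def add_noise(img: Image, rng: random.Random, scale: float = 0.05) -> Image:
    """Add uniform noise in [-scale, scale] to every pixel, clamped to [0, 1]."""
    return [
        [min(1.0, max(0.0, p + rng.uniform(-scale, scale))) for p in row]
        for row in img
    ]


class AugmentationPipeline:
    """Apply each (transform, probability) step in order to a copy of the image."""

    def __init__(self, steps: Sequence[Tuple[Transform, float]], seed: int = 0):
        self.steps = list(steps)
        self.rng = random.Random(seed)  # seeded for reproducible augmentation

    def __call__(self, img: Image) -> Image:
        out = [row[:] for row in img]  # never mutate the caller's image
        for transform, prob in self.steps:
            if self.rng.random() < prob:
                out = transform(out, self.rng)
        return out


def augment_dataset(
    images: Sequence[Image], pipeline: AugmentationPipeline, copies: int = 2
) -> List[Image]:
    """Return the original images followed by `copies` augmented variants of each."""
    augmented = list(images)
    for img in images:
        for _ in range(copies):
            augmented.append(pipeline(img))
    return augmented


if __name__ == "__main__":
    image = [[0.0, 0.5], [1.0, 0.25]]
    pipeline = AugmentationPipeline([(horizontal_flip, 0.5), (add_noise, 1.0)])
    dataset = augment_dataset([image], pipeline, copies=3)
    print(len(dataset), "images after augmentation")
```

For an imbalanced dataset, the same idea could be applied only to the minority class, with `copies` chosen to bring class counts closer together. Generative approaches replace the hand-written transforms with samples drawn from a trained model.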
Papers
SynDiff-AD: Improving Semantic Segmentation and End-to-End Autonomous Driving with Synthetic Data from Latent Diffusion Models
Harsh Goel, Sai Shankar Narasimhan, Oguzhan Akcin, Sandeep Chinchali
CIA: Controllable Image Augmentation Framework Based on Stable Diffusion
Mohamed Benkedadra, Dany Rimez, Tiffanie Godelaine, Natarajan Chidambaram, Hamed Razavi Khosroshahi, Horacio Tellez, Matei Mancas, Benoit Macq, Sidi Ahmed Mahmoudi