Compositional Data Augmentation

Compositional data augmentation enhances machine learning models by generating synthetic training data through the combination or modification of existing examples, aiming to improve generalization to unseen compositions of features or concepts. Current research focuses on developing effective augmentation strategies, often leveraging techniques like mixing, substitution (e.g., of image components or text spans), and generative models to create diverse and informative augmented data. This approach addresses limitations of standard models in handling compositional tasks, particularly in areas like natural language processing, computer vision, and semantic parsing, leading to improved performance and robustness in various applications.

Papers