Manifold Augmentation

Manifold augmentation is a data augmentation technique that generates synthetic training data by exploiting the underlying geometric structure (the manifold) of the data distribution. Current research develops methods tailored to specific data types (tabular, image, text) and learning paradigms (self-supervised, supervised), often using transformer-based or generative models that operate in feature embedding spaces. These techniques aim to improve model generalization, robustness to out-of-distribution data, and performance in low-data regimes, with applications ranging from computer vision and natural language processing to medical diagnosis and time-series analysis.
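
To make the idea of augmenting in a feature embedding space concrete, here is a minimal sketch of one simple variant: interpolating between a sample's embedding and one of its nearest same-class neighbours (a SMOTE-style scheme applied to embeddings). It is an illustration under stated assumptions, not the method of any particular paper listed here. The function name `augment_in_embedding_space` and its parameters are hypothetical, and the embeddings are assumed to have been computed beforehand by some encoder.

```python
import math
import random


def augment_in_embedding_space(embeddings, labels, n_new, k=3, seed=0):
    """Create synthetic embeddings by interpolating toward nearby same-class points.

    Hypothetical helper for illustration only. `embeddings` is a list of
    equal-length float vectors (assumed to come from a pretrained encoder),
    and `labels` is a parallel list of class labels.
    """
    rng = random.Random(seed)
    synthetic, synthetic_labels = [], []

    for _ in range(n_new):
        # Pick a random anchor point.
        i = rng.randrange(len(embeddings))
        anchor = embeddings[i]

        # Candidate partners: other points sharing the anchor's label.
        same = [j for j in range(len(embeddings))
                if j != i and labels[j] == labels[i]]
        if not same:
            # No same-class partner exists: fall back to copying the anchor.
            synthetic.append(list(anchor))
            synthetic_labels.append(labels[i])
            continue

        # Restrict to the k nearest partners so the interpolation stays local,
        # which is the (assumed) proxy for staying close to the data manifold.
        same.sort(key=lambda j: math.dist(anchor, embeddings[j]))
        j = rng.choice(same[:k])

        # Draw a point on the segment between the anchor and its neighbour.
        lam = rng.random()
        point = [a + lam * (b - a) for a, b in zip(anchor, embeddings[j])]
        synthetic.append(point)
        synthetic_labels.append(labels[i])

    return synthetic, synthetic_labels


if __name__ == "__main__":
    # Toy 2-D "embeddings" for two well-separated classes.
    emb = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
           [10.0, 10.0], [11.0, 10.0], [10.0, 11.0]]
    lab = [0, 0, 0, 1, 1, 1]
    new_points, new_labels = augment_in_embedding_space(emb, lab, n_new=4, k=2)
    for p, y in zip(new_points, new_labels):
        print(y, p)
```

Limiting interpolation to a few nearest same-class neighbours is a design choice in this sketch: straight-line mixing between distant points can leave the region where real data lies, whereas short local segments are more likely to stay near it. The synthetic embeddings would then be used to train a downstream classifier alongside the original ones.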

Papers