Data Enrichment

Data enrichment enhances datasets by adding information or synthetic samples to improve the performance of machine learning models, particularly when data are imbalanced or limited. Current research focuses on techniques such as data augmentation (e.g., image transformations, time series alterations), imputation of missing values, and the integration of external knowledge (e.g., entities extracted through named entity recognition, semantic cues). These methods are applied across diverse fields: improving rare event prediction in manufacturing, enhancing person re-identification in computer vision, and boosting the accuracy of clinical trial matching and legal question answering systems. The shared aim is more robust and reliable AI applications.
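
As a rough illustration of two of the techniques named above, the sketch below fills missing values with column means and then adds jittered copies of minority-class rows to a small tabular dataset. It is a minimal example under stated assumptions, not a method taken from any particular paper: the function names `impute_mean` and `oversample_with_jitter`, the toy data, the binary labels, and the Gaussian noise scale are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 if a column has no observed values (an arbitrary choice).
        col_means.append(mean(observed) if observed else 0.0)
    return [[col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def oversample_with_jitter(rows, labels, minority_label, scale=0.01, seed=0):
    """Append noisy copies of minority rows until both classes have equal counts.

    Assumes binary labels and rows with no missing values.
    """
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    new_rows, new_labels = list(rows), list(labels)
    if not minority:
        return new_rows, new_labels  # nothing to copy from
    n_majority = len(rows) - len(minority)
    for _ in range(max(0, n_majority - len(minority))):
        base = rng.choice(minority)
        # Synthetic sample: an existing minority row plus small Gaussian noise.
        new_rows.append([x + rng.gauss(0.0, scale) for x in base])
        new_labels.append(minority_label)
    return new_rows, new_labels


# Toy dataset: three majority-class rows (label 0), one minority row (label 1).
rows = [[1.0, 2.0], [None, 4.0], [3.0, None], [10.0, 20.0]]
labels = [0, 0, 0, 1]

filled = impute_mean(rows)
balanced_rows, balanced_labels = oversample_with_jitter(filled, labels, minority_label=1)
```

Imputation runs first so that the jittering step only ever sees numeric values. In practice the noise scale would need tuning per feature, and enrichment of this kind is normally applied to training data only, so that evaluation data stays untouched.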

Papers