Automatic Curation

Automatic curation uses computational methods to organize, label, and enhance datasets, addressing the cost, time, and scalability limits of manual curation. Current research develops algorithms and models, including transformers, diffusion models, and clustering techniques, that automate data cleaning, annotation, and selection for text, images, and video. By providing high-quality, readily accessible datasets for training and evaluation, these methods support machine learning in fields ranging from biomedical research and scientific publishing to autonomous driving and public art curation.

Papers