Labeling Source
Research on labeling sources studies how to generate training data for machine learning models efficiently by using "weak" labels (inexpensive, noisy, or incomplete annotations) instead of relying solely on expensive manual labeling. Current work explores ways to combine multiple weak sources, including rule-based systems, pre-trained models, and crowd-sourced data, often by fitting generative models (e.g., normalizing flows) over the sources' outputs or by adapting existing models to handle weak supervision. This matters because data annotation is a bottleneck in many machine learning applications, and weak supervision makes it possible to train accurate models even when high-quality labeled data is limited.
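To make the idea of combining weak sources concrete, here is a minimal sketch in Python. It assumes a toy spam-versus-not-spam task and three hypothetical rule-based labeling functions (`lf_contains_free`, `lf_has_greeting`, `lf_many_exclamations`), none of which come from the papers on this topic. It aggregates their votes with a plain majority vote, which is a simple baseline and not the generative-model approach mentioned above.

```python
from collections import Counter

ABSTAIN = -1  # a weak source may decline to label an example
NOT_SPAM, SPAM = 0, 1

# Hypothetical rule-based weak sources: each is cheap, noisy, and incomplete.
def lf_contains_free(text):
    return SPAM if "free" in text.lower() else ABSTAIN

def lf_has_greeting(text):
    return NOT_SPAM if text.lower().startswith(("hi", "hello")) else ABSTAIN

def lf_many_exclamations(text):
    return SPAM if text.count("!") >= 2 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_free, lf_has_greeting, lf_many_exclamations]

def majority_vote(text, lfs=LABELING_FUNCTIONS):
    """Combine weak votes; abstain if no source fires or the top votes tie."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting sources with equal support
    return ranked[0][0]

if __name__ == "__main__":
    for example in ["Free money!!", "Hello, see you tomorrow", "Lunch at noon?"]:
        print(repr(example), "->", majority_vote(example))
```

Majority vote treats every source as equally reliable. The methods surveyed here typically go further and estimate each source's accuracy, so that more trustworthy sources carry more weight in the final training label.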