Labeling Function

Labeling functions are programmatic rules used in weak supervision to label data automatically, reducing the cost and effort of manual annotation. They are particularly valuable in domains where labeled data is scarce, such as medicine and natural language processing. Current research focuses on improving how these functions are generated and evaluated: large language models are used to automate their creation, and techniques such as Shapley values are used to assess each function's individual contribution and refine overall performance. This approach can accelerate the training of machine learning models by enabling the use of larger, more diverse datasets and making model development more efficient.
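To make the idea concrete, here is a minimal sketch of three labeling functions for a toy spam task, combined by majority vote and scored with exact Shapley values. Everything in it is an illustrative assumption rather than a method taken from any particular paper: the function names, the four-example development set, the use of majority vote in place of a learned label model, and the choice of development-set accuracy as the value function for the Shapley computation.

```python
from itertools import combinations
from math import factorial

# Hypothetical label scheme: a labeling function may abstain.
ABSTAIN, HAM, SPAM = -1, 0, 1

# Three hypothetical labeling functions: simple programmatic rules.
def lf_contains_link(text):
    return SPAM if "http" in text.lower() else ABSTAIN

def lf_mentions_prize(text):
    return SPAM if "prize" in text.lower() else ABSTAIN

def lf_short_reply(text):
    return HAM if len(text.split()) <= 3 else ABSTAIN

LFS = [lf_contains_link, lf_mentions_prize, lf_short_reply]

def majority_vote(text, lfs):
    """Combine non-abstaining votes; abstain on ties or when no rule fires."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    spam = sum(v == SPAM for v in votes)
    ham = len(votes) - spam
    if spam == ham:
        return ABSTAIN
    return SPAM if spam > ham else HAM

def accuracy(lfs, dev_set):
    """Fraction of dev examples labeled correctly; abstentions count as wrong."""
    correct = sum(majority_vote(text, lfs) == label for text, label in dev_set)
    return correct / len(dev_set)

def shapley_values(lfs, dev_set):
    """Exact Shapley value of each labeling function, with accuracy as payoff.

    Enumerates every subset, so it is only practical for a handful of functions.
    """
    n = len(lfs)
    values = {}
    for i, lf in enumerate(lfs):
        others = [g for j, g in enumerate(lfs) if j != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain lf.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_lf = accuracy(list(subset) + [lf], dev_set)
                without_lf = accuracy(list(subset), dev_set)
                total += weight * (with_lf - without_lf)
        values[lf.__name__] = total
    return values

# Tiny made-up development set with gold labels, for illustration only.
DEV = [
    ("Win a prize now http://x.example", SPAM),
    ("Claim your prize today by replying to this message", SPAM),
    ("ok thanks", HAM),
    ("See you at http://meeting.example tomorrow morning for the review", HAM),
]

labels = [majority_vote(text, LFS) for text, _ in DEV]
contributions = shapley_values(LFS, DEV)
```

Here `labels` holds the weak labels assigned to each development example, and `contributions` maps each labeling function's name to its Shapley value. By construction the values sum to the accuracy of the full set minus the accuracy of the empty set, so a function with a low or negative value is a candidate for revision or removal.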

Papers