Label Variation

Label variation is the disagreement in the labels that different annotators or models assign to the same data points. It is a significant challenge in machine learning because it degrades model accuracy and robustness. Current research mitigates it with techniques such as multi-task learning, architectural adjustments that account for label distribution skew (e.g., concatenated activations and logit adjustments), and training objectives that incorporate label uncertainty. Addressing label variation is important for the reliability and generalizability of machine learning models, particularly in scientific information extraction and natural language processing.
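As a rough illustration of what incorporating label uncertainty into training can look like, the sketch below turns per-item annotator votes into soft label distributions and scores model logits against them with cross-entropy. It is not taken from any of the papers listed here. The function names (`soft_labels`, `soft_cross_entropy`) and the toy votes are hypothetical, and the code assumes every item has at least one vote and that labels are integers in `range(num_classes)`.

```python
import math
from collections import Counter


def soft_labels(votes, num_classes):
    """Convert each item's annotator votes into a probability distribution.

    Assumes every item has at least one vote (illustrative assumption).
    """
    distributions = []
    for item_votes in votes:
        counts = Counter(item_votes)
        total = len(item_votes)
        distributions.append([counts.get(c, 0) / total for c in range(num_classes)])
    return distributions


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def soft_cross_entropy(logits_batch, targets_batch):
    """Mean cross-entropy between softmax(logits) and soft target distributions."""
    losses = []
    for logits, target in zip(logits_batch, targets_batch):
        log_probs = log_softmax(logits)
        losses.append(-sum(t * lp for t, lp in zip(target, log_probs)))
    return sum(losses) / len(losses)


# Toy data: three annotators per item, three classes.
votes = [[0, 0, 1], [2, 2, 2]]
targets = soft_labels(votes, num_classes=3)
logits = [[2.0, 1.0, 0.0], [0.0, 0.0, 3.0]]
loss = soft_cross_entropy(logits, targets)
```

With this objective, an item on which annotators split two to one contributes a target of roughly (0.67, 0.33, 0) rather than a single majority label, so the model is not penalized for placing some probability on the minority reading.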

Papers