Annotator Rating

Annotator rating is the process of aggregating human judgments for tasks such as evaluating text or images. It is crucial for training and evaluating machine learning models, especially in subjective domains. Current research focuses on understanding and mitigating annotator disagreement. This includes developing models that predict individual annotators' ratings from demographic information and online behavior, and exploring aggregation methods beyond simple majority voting that better reflect nuanced opinions. These efforts aim to improve the reliability and validity of human evaluations, leading to more accurate and robust machine learning systems in applications such as natural language processing, including machine translation.
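
To make the contrast between majority voting and alternative aggregation concrete, here is a minimal Python sketch. It is an illustration only, not a method taken from any particular paper: the function names `majority_vote` and `soft_label`, the example labels, and the tie-breaking rule are all assumptions. The soft label (the empirical distribution over labels) is one simple way to keep disagreement visible instead of collapsing it to a single answer.

```python
from collections import Counter


def majority_vote(ratings):
    """Return the most frequent label.

    Ties are broken by picking the smallest label in sorted order
    (an arbitrary choice made for this sketch).
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


def soft_label(ratings):
    """Return the share of annotators who chose each label."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    total = len(ratings)
    return {label: count / total for label, count in sorted(counts.items())}


# Hypothetical ratings from five annotators for a single item.
ratings = ["toxic", "toxic", "not_toxic", "toxic", "not_toxic"]

print(majority_vote(ratings))  # toxic
print(soft_label(ratings))     # {'not_toxic': 0.4, 'toxic': 0.6}
```

In this example the majority vote reports only "toxic", while the soft label records that two of the five annotators disagreed, which a downstream model or evaluation could take into account.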

Papers