Unbiased Evaluation
Unbiased evaluation aims to assess the performance of machine learning models without the distortions introduced by biases in the data or in the evaluation metrics themselves. Current research addresses such biases across several domains, including anomaly detection, question answering, large language model ranking, and recommender systems, using techniques such as adjusted scoring metrics, debiased data splits, and sample-efficient human evaluation strategies. These efforts are essential for producing reliable, fair, and trustworthy assessments of model performance, and ultimately for building more robust and equitable AI systems.
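As a concrete illustration of an adjusted scoring metric (the specific metrics used in the surveyed papers may differ, so treat this as a minimal sketch rather than a definitive method): plain accuracy rewards a model that always predicts the majority class, which biases evaluation on imbalanced data such as anomaly detection. Chance-corrected balanced accuracy removes that bias by averaging per-class recall and rescaling so that random guessing scores 0.

import numpy as np

def adjusted_balanced_accuracy(y_true, y_pred):
    """Chance-corrected balanced accuracy (illustrative sketch).

    Balanced accuracy averages per-class recall, so the majority class
    cannot dominate the score. The adjustment rescales the result so a
    chance-level classifier scores 0 and a perfect classifier scores 1.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.unique(y_true)

    # Per-class recall: fraction of each true class that was recovered.
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    balanced_acc = float(np.mean(recalls))

    # Rescale so chance-level performance (1 / n_classes) maps to 0.
    chance = 1.0 / len(classes)
    return (balanced_acc - chance) / (1.0 - chance)

if __name__ == "__main__":
    # A degenerate "always predict normal" detector looks strong under
    # plain accuracy (95%) but scores 0 under the adjusted metric.
    y_true = [0] * 95 + [1] * 5
    y_pred = [0] * 100
    print(adjusted_balanced_accuracy(y_true, y_pred))  # 0.0

The same idea underlies many of the adjusted metrics in this area: estimate the score a trivial or biased baseline would achieve, then report performance relative to that baseline rather than in absolute terms.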