LLM Evaluator
LLM evaluators are large language models (LLMs) used to assess the quality of text generated by other LLMs, offering a scalable alternative to costly and subjective human evaluation. Current research focuses on improving their accuracy and reliability by mitigating known biases (e.g., position bias, token-count bias, and self-preference), strengthening alignment with human judgments, and exploring alternative architectures such as ensembles of smaller models or hierarchical decomposition of evaluation criteria. This work enables more objective benchmarking and supports the responsible deployment of LLMs across a wide range of applications.
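To illustrate one common bias-mitigation technique mentioned above, the sketch below shows a pairwise LLM-as-judge comparison that queries the judge twice with the candidate answers in swapped positions and only accepts a verdict when both orderings agree. This is a minimal sketch, not any specific paper's method; the `judge_fn` callable and the prompt wording are assumptions standing in for whatever LLM backend is used.

```python
from typing import Callable, Literal

Verdict = Literal["A", "B", "tie"]

# Hypothetical prompt template; real systems typically use more detailed rubrics.
JUDGE_PROMPT = (
    "You are an impartial evaluator. Given a question and two candidate "
    "answers, reply with exactly one token: 'A', 'B', or 'tie'.\n\n"
    "Question: {question}\n\nAnswer A: {answer_a}\n\nAnswer B: {answer_b}"
)


def pairwise_judge(
    question: str,
    answer_1: str,
    answer_2: str,
    judge_fn: Callable[[str], Verdict],  # assumed wrapper around any LLM judge call
) -> Verdict:
    """Compare two answers with position-swap debiasing.

    The judge is queried twice, once with each answer in slot A. A win is
    reported only if both orderings agree; inconsistent verdicts are treated
    as a tie, which filters out preferences explained by position bias alone.
    """
    first = judge_fn(
        JUDGE_PROMPT.format(question=question, answer_a=answer_1, answer_b=answer_2)
    )
    second = judge_fn(
        JUDGE_PROMPT.format(question=question, answer_a=answer_2, answer_b=answer_1)
    )

    if first == "A" and second == "B":
        return "A"   # answer_1 preferred in both orderings
    if first == "B" and second == "A":
        return "B"   # answer_2 preferred in both orderings
    return "tie"     # disagreement or explicit ties
```

The same two-pass idea extends to other biases, e.g., normalizing answer length before judging to reduce token-count bias, or using a judge from a different model family to limit self-preference.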