Expert Tax Judge
Research on "expert tax judge" (or, more broadly, on LLMs as judges) focuses on developing and evaluating large language models (LLMs) that can reliably assess the quality of other models' outputs, a task made crucial by the rapid advancement of these systems. Current research emphasizes mitigating biases in these "judge" LLMs (e.g., position bias and various social biases), exploring diverse model architectures, including ensembles of smaller models, to improve accuracy and reduce cost, and developing robust evaluation metrics that go beyond simple agreement rates. This work is significant for the trustworthiness and reliability of LLM evaluation, and ultimately for the development and deployment of LLMs across a range of applications.
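As a minimal sketch of one bias-mitigation idea mentioned above, the snippet below shows how position bias in a pairwise judge can be reduced by querying the judge twice with the candidate order swapped, and how several judges can be combined by majority vote. The `judge_fn(prompt, answer_a, answer_b) -> "A" | "B"` signature is a hypothetical stand-in; a real system would call an LLM API at that point.

```python
from collections import Counter


def debiased_verdict(judge_fn, prompt, ans1, ans2):
    """Query the judge twice with the candidate order swapped.

    The second verdict is mapped back to the original labels; if the
    two runs disagree, the comparison is declared a tie, which cancels
    out any preference the judge has for a particular position.
    """
    first = judge_fn(prompt, ans1, ans2)           # ans1 shown in slot "A"
    second = judge_fn(prompt, ans2, ans1)          # ans2 shown in slot "A"
    second_mapped = {"A": "B", "B": "A"}[second]   # undo the swap
    return first if first == second_mapped else "tie"


def ensemble_verdict(judge_fns, prompt, ans1, ans2):
    """Majority vote over several (cheaper, smaller) judge models."""
    votes = [debiased_verdict(j, prompt, ans1, ans2) for j in judge_fns]
    return Counter(votes).most_common(1)[0][0]


# Toy judge with a deliberate position bias: it always prefers
# whichever answer is shown first, regardless of content.
def position_biased_judge(prompt, answer_a, answer_b):
    return "A"


# The swap-and-compare protocol exposes the bias as a tie.
print(debiased_verdict(position_biased_judge, "Q?", "x", "y"))  # -> tie
```

A content-sensitive judge (one whose verdict depends on the answers rather than their order) passes the consistency check and keeps its verdict, so the protocol only discards judgments that flip under reordering.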
Papers
19 papers, dated October 15, 2022 through November 5, 2024 (titles and links were not preserved in this capture).