Fine-Tuned Judge Models
Fine-tuned judge models are large language models (LLMs) trained to evaluate the quality of other LLMs' outputs, with the goal of making LLM evaluation more reliable and efficient. Current research focuses on mitigating biases in these judges (e.g., length bias, position bias) through techniques such as contrastive training, calibration, and debiased training datasets, often using generative judges that provide interpretable rationales for their verdicts. This work matters because robust, unbiased evaluation methods underpin both the development of more reliable AI systems and the advancement of LLM research itself.
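As a concrete illustration of one bias-mitigation technique mentioned above, the sketch below shows a common position-bias check for a pairwise judge: query the judge twice with the answer order swapped, and accept a verdict only if it is stable under the swap. The `judge` callable is a hypothetical stand-in for a real fine-tuned judge model; the toy judges in the demo exist only to show the mechanism.

```python
def debiased_pairwise_verdict(judge, question, answer_a, answer_b):
    """Query a pairwise judge twice with the answer order swapped.

    Position bias: judges often favor whichever answer appears first
    (or last). A simple mitigation is to accept a verdict only when it
    is stable under swapping, and report a tie otherwise.
    """
    first = judge(question, answer_a, answer_b)    # returns "A", "B", or "tie"
    swapped = judge(question, answer_b, answer_a)  # labels refer to slots shown
    # Map the swapped verdict back to the original answer identities.
    remap = {"A": "B", "B": "A", "tie": "tie"}
    second = remap[swapped]
    return first if first == second else "tie"


if __name__ == "__main__":
    # Toy judge that prefers the longer answer: position-consistent,
    # so its verdict survives the swap check.
    def length_judge(question, a, b):
        if len(a) == len(b):
            return "tie"
        return "A" if len(a) > len(b) else "B"

    print(debiased_pairwise_verdict(length_judge, "Q?", "short", "a longer answer"))
    # → "B" (the longer answer wins under both orderings)

    # Toy judge that always picks the first slot (pure position bias):
    # its verdicts flip under swapping, so the check returns "tie".
    first_slot_judge = lambda q, a, b: "A"
    print(debiased_pairwise_verdict(first_slot_judge, "Q?", "x", "y"))
    # → "tie"
```

The same swap-and-compare idea generalizes to averaging scores across both orderings rather than forcing a tie, which trades decisiveness for calibration.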