Multiple-Choice Questions
Multiple-choice questions (MCQs) are widely used to evaluate large language models (LLMs), assessing their knowledge, reasoning, and critical-thinking abilities across diverse domains. Current research focuses on improving LLM performance on MCQs, exploring techniques such as retrieval-augmented generation, fine-tuning with tailored demonstrations, and mitigating biases such as positional preferences and over-reliance on answer choices. This research matters because robust, unbiased MCQ benchmarks are crucial for evaluating LLM capabilities and for ensuring reliable deployment in education, professional certification, and other high-stakes contexts.
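As one illustration of the bias-mitigation work mentioned above, a common strategy is to present the same question under several answer orderings and average the per-option scores, so that no option benefits from sitting in a preferred slot. The sketch below is a minimal, hypothetical example of that idea, not the method of any specific paper listed here: `score_options` is a stand-in for a real model query (e.g., log-probabilities of the answer letters), and the names and numbers are illustrative only.

```python
from collections import defaultdict

def score_options(question: str, options: list[str]) -> list[float]:
    """Placeholder for an LLM call returning one score per presented option
    (e.g. the log-probabilities of the answer letters A-D). This stand-in
    mixes a fixed content signal with a strong preference for whatever sits
    in the first slot, mimicking the positional bias to be cancelled."""
    content = {"Venus": 0.2, "Mars": 0.6, "Jupiter": 0.3, "Saturn": 0.1}
    return [content[opt] + (0.5 if i == 0 else 0.0)
            for i, opt in enumerate(options)]

def debiased_answer(question: str, options: list[str]) -> str:
    """Score the question under every cyclic ordering of the options, so each
    option occupies each position exactly once, then average the scores.
    The positional component contributes equally to every option and cancels."""
    totals = defaultdict(float)
    n = len(options)
    for shift in range(n):
        ordering = options[shift:] + options[:shift]  # cyclic permutation
        for opt, score in zip(ordering, score_options(question, ordering)):
            totals[opt] += score / n
    return max(totals, key=totals.get)

if __name__ == "__main__":
    question = "Which planet is known as the Red Planet?"
    options = ["Venus", "Mars", "Jupiter", "Saturn"]

    single = score_options(question, options)
    print("single ordering:", options[single.index(max(single))])       # Venus (slot-0 bonus wins)
    print("permutation-averaged:", debiased_answer(question, options))  # Mars
```

Cyclic shifts are used here because n orderings already place every option in every position once; evaluating all n! permutations would balance positions too, but at much higher cost per question.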
Papers