Multiple Choice Question
Multiple-choice questions (MCQs) are widely used to evaluate large language models (LLMs), probing their knowledge, reasoning, and critical-thinking abilities across diverse domains. Current research focuses on improving LLM performance on MCQs through techniques such as retrieval-augmented generation and fine-tuning with tailored demonstrations, and on mitigating evaluation biases such as positional preferences and over-reliance on answer choices. This work matters because robust, unbiased MCQ benchmarks are essential for evaluating LLM capabilities and for ensuring their reliable use in education, professional certification, and other high-stakes settings.
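One of the evaluation pitfalls noted above is positional bias, where a model favors particular answer slots regardless of content. The sketch below shows one common mitigation idea: asking the same question under several answer orderings and aggregating the results. It is a minimal illustration, not a method from any specific paper; `ask_model` is a hypothetical stand-in for whatever LLM call is under evaluation, and the function and field names are illustrative.

```python
import random
import string
from collections import Counter
from typing import Callable, List


def evaluate_with_shuffled_choices(
    question: str,
    choices: List[str],                   # original answer options
    correct: str,                         # text of the gold answer
    ask_model: Callable[[str], str],      # hypothetical LLM call: prompt -> letter
    n_orderings: int = 4,
    seed: int = 0,
) -> dict:
    """Ask the same MCQ under several answer orderings to reduce positional bias."""
    rng = random.Random(seed)
    letters = string.ascii_uppercase[: len(choices)]
    picks = []  # option *text* chosen under each ordering

    for _ in range(n_orderings):
        order = list(choices)
        rng.shuffle(order)
        prompt = (
            question
            + "\n"
            + "\n".join(f"{letter}. {text}" for letter, text in zip(letters, order))
            + "\nAnswer with a single letter."
        )
        reply = ask_model(prompt).strip()[:1].upper()
        if reply in letters:
            # Map the chosen letter back to the option text for this ordering,
            # so answers are comparable across shuffles.
            picks.append(order[letters.index(reply)])

    if not picks:
        return {"majority_answer": None, "correct": False, "consistency": 0.0}
    majority, count = Counter(picks).most_common(1)[0]
    return {
        "majority_answer": majority,
        "correct": majority == correct,
        "consistency": count / len(picks),  # agreement across orderings
    }


# Toy usage: a dummy "model" that always answers "B" illustrates positional bias;
# its picks track the answer slot rather than the content, so consistency is low.
print(evaluate_with_shuffled_choices(
    "Which planet is closest to the Sun?",
    ["Venus", "Mercury", "Earth", "Mars"],
    correct="Mercury",
    ask_model=lambda prompt: "B",
))
```

Aggregating by majority vote and reporting a consistency score is only one way to decouple measured accuracy from answer-position effects; averaging per-ordering accuracy or scoring only consistently answered items are common alternatives.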