Multiple-Choice Questions
Multiple-choice questions (MCQs) are widely used to evaluate large language models (LLMs), probing their knowledge, reasoning, and critical-thinking abilities across diverse domains. Current research focuses on improving LLM performance on MCQs through techniques such as retrieval-augmented generation and fine-tuning with tailored demonstrations, and on mitigating biases such as positional preferences and over-reliance on answer choices. This work matters because robust, unbiased MCQ benchmarks are crucial for evaluating LLM capabilities and for ensuring their reliable application in education, professional certification, and other high-stakes contexts.
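To make the positional-preference issue concrete, the sketch below re-asks the same question under every cyclic rotation of its answer options and checks whether the model's choice tracks the content or the position. It is a minimal illustration, not any particular paper's method: `query_model` is a hypothetical placeholder for whatever LLM client is being benchmarked, and the answer parsing is deliberately naive.

```python
import string
from collections import Counter

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for an actual LLM call (e.g. an API client).
    Assumed to take a prompt string and return the model's raw text reply."""
    raise NotImplementedError("Replace with a call to the LLM under evaluation.")

def format_mcq(question: str, options: list[str]) -> str:
    """Render a question and its options as a lettered multiple-choice prompt."""
    letters = string.ascii_uppercase
    lines = [question]
    lines += [f"{letters[i]}. {opt}" for i, opt in enumerate(options)]
    lines.append("Answer with the letter of the correct option only.")
    return "\n".join(lines)

def positional_bias_probe(question: str, options: list[str], correct: str) -> dict:
    """Ask the same question with every cyclic rotation of the options.

    If accuracy (or the distribution of chosen letters) shifts with the
    ordering, that is evidence of positional preference rather than knowledge.
    """
    letters = string.ascii_uppercase
    results = {"correct": 0, "total": 0, "chosen_letters": Counter()}
    n = len(options)
    for shift in range(n):
        rotated = options[shift:] + options[:shift]
        reply = query_model(format_mcq(question, rotated)).strip().upper()
        chosen_letter = reply[:1]  # naive parse: first character of the reply
        results["chosen_letters"][chosen_letter] += 1
        picked = rotated[letters.index(chosen_letter)] if chosen_letter in letters[:n] else None
        results["correct"] += int(picked == correct)
        results["total"] += 1
    return results
```

Aggregating `chosen_letters` over many questions gives a rough bias signal: an unbiased model's letter distribution should mirror where the correct answers actually land, whereas a skew toward a fixed letter (e.g. always "A") indicates positional preference.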