Massive Multitask Language Understanding
Massive Multitask Language Understanding (MMLU) research aims to evaluate the breadth and depth of knowledge and reasoning in large language models (LLMs) across diverse domains. Current work focuses on developing more robust and challenging benchmarks, such as MMLU-Pro and its variants, and on addressing issues like shortcut learning, answer-order bias, and data contamination so that reported performance metrics are more reliable. These efforts matter for both LLM development and responsible deployment, informing the scientific understanding of AI as well as the practical application of LLMs across fields.
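One of the issues mentioned above, answer-order bias, can be probed directly: re-evaluate the same multiple-choice questions with the options cyclically shifted and compare accuracy across shifts. The sketch below is a minimal, hypothetical illustration of that idea; the `predict(question, options)` callable, the toy questions, and the deliberately position-biased predictor are all assumptions for demonstration, not part of any published MMLU harness.

```python
from typing import Callable, Dict, List

def accuracy_under_permutations(
    questions: List[Dict],
    predict: Callable[[str, List[str]], int],
    n_opts: int = 4,
) -> Dict[str, float]:
    """Measure answer-order bias: accuracy for each cyclic shift of the options.

    A model with no positional bias should score roughly the same for every
    shift; large gaps between shifts indicate sensitivity to answer order.
    """
    results = {}
    for shift in range(n_opts):
        correct = 0
        for q in questions:
            # Rotate the options and move the gold index accordingly.
            opts = q["options"][shift:] + q["options"][:shift]
            gold = (q["answer"] - shift) % n_opts
            if predict(q["question"], opts) == gold:
                correct += 1
        results[f"shift_{shift}"] = correct / len(questions)
    return results

# Toy data and a predictor that always picks the first option,
# i.e. a maximally position-biased "model" (hypothetical).
qs = [
    {"question": "2 + 2 = ?", "options": ["4", "3", "5", "6"], "answer": 0},
    {"question": "Capital of France?", "options": ["Paris", "Rome", "Lima", "Oslo"], "answer": 0},
]
always_first = lambda question, options: 0
print(accuracy_under_permutations(qs, always_first))
```

With the biased toy predictor, accuracy is perfect only when the gold answer happens to sit in position 0 and collapses otherwise, which is exactly the signature that permutation-based robustness checks are designed to expose.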