Benchmark Suite
Benchmark suites are collections of standardized datasets and evaluation protocols for assessing machine learning models across diverse tasks. Current research develops suites for domains including video understanding, log analysis, compiler autotuning, and natural language processing, and often uses them to evaluate large language models and other deep learning architectures. Because every model is run on the same data under the same protocol, these suites support reproducible research and fair comparisons between models and algorithms, which in turn helps drive the development of more robust and reliable AI systems that generalize better to real-world applications.
Papers
November 15, 2023
October 6, 2023
September 29, 2023
September 15, 2023
July 20, 2023
May 24, 2023
May 20, 2023
January 11, 2023
December 20, 2022
November 3, 2022
October 13, 2022
July 20, 2022
June 12, 2022
June 8, 2022
April 25, 2022