New Benchmark
Recent research focuses on developing comprehensive benchmarks for evaluating large language models (LLMs) and other machine learning models across diverse tasks, including economic games, financial question answering, graph analysis, and robotic manipulation. These benchmarks aim to standardize evaluation methodology, address issues such as fairness and robustness, and quantify uncertainty in model performance; the models they evaluate span architectures such as transformers and graph neural networks. The resulting standardized evaluations and datasets enable more rigorous model comparisons and help identify areas needing improvement, which in turn supports more reliable and effective AI systems.
Papers
Perceptual Quality Assessment of Face Video Compression: A Benchmark and An Effective Method
Yixuan Li, Bolin Chen, Baoliang Chen, Meng Wang, Shiqi Wang, Weisi Lin
nanoLM: an Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales
Yiqun Yao, Siqi Fan, Xiusheng Huang, Xuezhi Fang, Xiang Li, Ziyi Ni, Xin Jiang, Xuying Meng, Peng Han, Shuo Shang, Kang Liu, Aixin Sun, Yequan Wang
Few Shot Semantic Segmentation: a review of methodologies, benchmarks, and open challenges
Nico Catalano, Matteo Matteucci
Wild Face Anti-Spoofing Challenge 2023: Benchmark and Results
Dong Wang, Jia Guo, Qiqi Shao, Haochi He, Zhian Chen, Chuanbao Xiao, Ajian Liu, Sergio Escalera, Hugo Jair Escalante, Zhen Lei, Jun Wan, Jiankang Deng
Rail Detection: An Efficient Row-based Network and A New Benchmark
Xinpeng Li, Xiaojiang Peng
Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent Transportation
Yifeng Shi, Feng Lv, Xinliang Wang, Chunlong Xia, Shaojie Li, Shujie Yang, Teng Xi, Gang Zhang
HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
Eslam Mohamed Bakr, Pengzhan Sun, Xiaoqian Shen, Faizan Farooq Khan, Li Erran Li, Mohamed Elhoseiny
PlantDet: A benchmark for Plant Detection in the Three-Rivers-Source Region
Huanhuan Li, Xuechao Zou, Yu-an Zhang, Jiangcai Zhaba, Guomei Li, Lamao Yongga