Benchmark Platform
Benchmark platforms provide standardized evaluations of models and algorithms, enabling fair comparisons between methods and driving research progress. Current research focuses on building comprehensive benchmarks in areas such as natural language processing, computer vision, robotics, and healthcare, often targeting recent model families such as large language models and other deep learning systems. By facilitating reproducible research and exposing the limitations of existing methods, these platforms support the development of more robust and reliable systems for real-world applications. The resulting insights inform the design of improved algorithms and contribute to a more rigorous and transparent scientific process.
Papers
Design and Benchmarking of A Multi-Modality Sensor for Robotic Manipulation with GAN-Based Cross-Modality Interpretation
Dandan Zhang, Wen Fan, Jialin Lin, Haoran Li, Qingzheng Cong, Weiru Liu, Nathan F. Lepora, Shan Luo
Benchmarking Large and Small MLLMs
Xuelu Feng, Yunsheng Li, Dongdong Chen, Mei Gao, Mengchen Liu, Junsong Yuan, Chunming Qiao
Benchmarking Constraint-Based Bayesian Structure Learning Algorithms: Role of Network Topology
Radha Nagarajan, Marco Scutari
MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception
Xiaoshuai Hao, Guanqun Liu, Yuting Zhao, Yuheng Ji, Mengchuan Wei, Haimei Zhao, Lingdong Kong, Rong Yin, Yu Liu
Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning
Yingjie Zhu, Xuefeng Bai, Kehai Chen, Yang Xiang, Min Zhang
SAVGBench: Benchmarking Spatially Aligned Audio-Video Generation
Kazuki Shimada, Christian Simon, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration
Lu Liu, Huiyu Duan, Qiang Hu, Liu Yang, Chunlei Cai, Tianxiao Ye, Huayu Liu, Xiaoyun Zhang, Guangtao Zhai
Benchmarking and Understanding Compositional Relational Reasoning of LLMs
Ruikang Ni, Da Xiao, Qingye Meng, Xiangyu Li, Shihui Zheng, Hongliang Liang
ShiftedBronzes: Benchmarking and Analysis of Domain Fine-Grained Classification in Open-World Settings
Rixin Zhou, Honglin Pang, Qian Zhang, Ruihua Qi, Xi Yang, Chuntao Li
Beacon: A Naturalistic Driving Dataset During Blackouts for Benchmarking Traffic Reconstruction and Control
Supriya Sarker, Iftekharul Islam, Bibek Poudel, Weizi Li
Benchmarking learned algorithms for computed tomography image reconstruction tasks
Maximilian B. Kiss, Ander Biguri, Zakhar Shumaylov, Ferdia Sherry, K. Joost Batenburg, Carola-Bibiane Schönlieb, Felix Lucka
LCFO: Long Context and Long Form Output Dataset and Benchmarking
Marta R. Costa-jussà, Pierre Andrews, Mariano Coria Meglioli, Joy Chen, Joe Chuang, David Dale, Christophe Ropers, Alexandre Mourachko, Eduardo Sánchez, Holger Schwenk, Tuan Tran, Arina Turkatenko, Carleigh Wood
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions
Mohammadmostafa Rostamkhani, Baktash Ansari, Hoorieh Sabzevari, Farzan Rahmani, Sauleh Eetemadi
MO-IOHinspector: Anytime Benchmarking of Multi-Objective Algorithms using IOHprofiler
Diederick Vermetten, Jeroen Rook, Oliver L. Preuß, Jacob de Nobel, Carola Doerr, Manuel López-Ibáñez, Heike Trautmann, Thomas Bäck
Benchmarking Vision-Based Object Tracking for USVs in Complex Maritime Environments
Muhayy Ud Din, Ahsan B. Bakht, Waseem Akram, Yihao Dong, Lakmal Seneviratne, Irfan Hussain