Global Evaluation
Global evaluation research develops robust, reliable methods for assessing the performance of models and systems across scientific domains, addressing challenges such as data diversity, shifting data distributions, and the need for human-centered metrics. Current work emphasizes comprehensive benchmarks and evaluation frameworks that incorporate techniques such as Item Response Theory and multi-faceted metrics beyond simple accuracy, and it spans model architectures including Large Language Models (LLMs), Convolutional Neural Networks (CNNs), and Graph Neural Networks (GNNs). These advances are crucial for ensuring the trustworthiness and effectiveness of AI systems in applications ranging from medical diagnosis to autonomous driving, and for fostering reproducible, comparable research within the scientific community.
Papers
Evaluation of Xilinx Deep Learning Processing Unit under Neutron Irradiation
D. Agiakatsikas, N. Foutris, A. Sari, V. Vlagkoulis, I. Souvatzoglou, M. Psarakis, M. Luján, M. Kastriotou, C. Cazzaniga
Evaluation of creating scoring opportunities for teammates in soccer via trajectory prediction
Masakiyo Teranishi, Kazushi Tsutsui, Kazuya Takeda, Keisuke Fujii
AI-enabled Sound Pattern Recognition on Asthma Medication Adherence: Evaluation with the RDA Benchmark Suite
Nikos D. Fakotakis, Stavros Nousias, Gerasimos Arvanitis, Evangelia I. Zacharaki, Konstantinos Moustakas
A Review and Evaluation of Elastic Distance Functions for Time Series Clustering
Chris Holder, Matthew Middlehurst, Anthony Bagnall