External Validation
External validation assesses how well a model performs and generalizes on data it was not developed on, typically data from a different site, population, or time period. Current research emphasizes robust validation strategies, often pairing techniques like k-fold cross-validation and weighted importance sampling with external test sets drawn from diverse sources, so that performance estimates hold across contexts. This is crucial for building trust in AI systems in applications ranging from medical diagnosis and prognosis to autonomous vehicle control and software testing. The development of standardized validation frameworks and benchmarks is a growing trend, aimed at making results more reproducible and comparable across studies.
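The difference between a k-fold cross-validation estimate and an external test set can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any of the papers listed here: the synthetic cohorts, the one-feature threshold "model", and the helper names (`k_fold_indices`, `fit_threshold`, `make_cohort`, `internal_cv_accuracy`) are all hypothetical. The external cohort is generated with a shifted feature distribution to mimic data from a different source.

```python
import random
from statistics import mean


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most 1."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_threshold(xs, ys):
    """Toy 'model' (assumption): midpoint between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (mean(pos) + mean(neg)) / 2


def accuracy(threshold, xs, ys):
    """Fraction of samples where 'x > threshold' agrees with the label."""
    hits = [1.0 if (x > threshold) == (y == 1) else 0.0 for x, y in zip(xs, ys)]
    return mean(hits)


def make_cohort(rng, n, shift=0.0):
    """Synthetic cohort: class 0 ~ N(shift, 1), class 1 ~ N(2 + shift, 1)."""
    ys = [i % 2 for i in range(n)]  # alternating labels, so every fold has both classes
    xs = [rng.gauss(2.0 * y + shift, 1.0) for y in ys]
    return xs, ys


def internal_cv_accuracy(xs, ys, k=5):
    """Mean held-out accuracy over k folds of the development cohort."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        threshold = fit_threshold(train_x, train_y)
        test_x = [xs[i] for i in held_out]
        test_y = [ys[i] for i in held_out]
        scores.append(accuracy(threshold, test_x, test_y))
    return mean(scores)


rng = random.Random(0)
dev_x, dev_y = make_cohort(rng, 200)             # development cohort
ext_x, ext_y = make_cohort(rng, 200, shift=1.0)  # external cohort with shifted features

cv_acc = internal_cv_accuracy(dev_x, dev_y)      # internal estimate
final_threshold = fit_threshold(dev_x, dev_y)    # refit on all development data
ext_acc = accuracy(final_threshold, ext_x, ext_y)  # external estimate

print(f"internal 5-fold accuracy: {cv_acc:.3f}")
print(f"external cohort accuracy: {ext_acc:.3f}")
```

In this sketch the cross-validation folds all come from the same distribution as the training data, so they cannot reveal how the threshold behaves once the feature distribution moves. Only the separately generated external cohort probes that, which is the gap an external test set is meant to close.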
Papers
Validation of the Scientific Literature via Chemputation Augmented by Large Language Models
Sebastian Pagel, Michael Jirasek, Leroy Cronin
Viscoelasticity Estimation of Sports Prosthesis by Energy-minimizing Inverse Kinematics and Its Validation by Forward Dynamics
Yuta Shimane, Taiki Ishigaki, Sunghee Kim, Ko Yamamoto
Computational Pathology for Accurate Prediction of Breast Cancer Recurrence: Development and Validation of a Deep Learning-based Tool
Ziyu Su, Yongxin Guo, Robert Wesolowski, Gary Tozbikian, Nathaniel S. O'Connell, M. Khalid Khan Niazi, Metin N. Gurcan
Towards Accountable AI-Assisted Eye Disease Diagnosis: Workflow Design, External Validation, and Continual Learning
Qingyu Chen, Tiarnan D L Keenan, Elvira Agron, Alexis Allot, Emily Guan, Bryant Duong, Amr Elsawy, Benjamin Hou, Cancan Xue, Sanjeeb Bhandari, Geoffrey Broadhead, Chantal Cousineau-Krieger, Ellen Davis, William G Gensheimer, David Grasic, Seema Gupta, Luis Haddock, Eleni Konstantinou, Tania Lamba, Michele Maiberger, Dimosthenis Mantopoulos, Mitul C Mehta, Ayman G Nahri, Mutaz AL-Nawaflh, Arnold Oshinsky, Brittany E Powell, Boonkit Purt, Soo Shin, Hillary Stiefel, Alisa T Thavikulwat, Keith James Wroblewski, Tham Yih Chung, Chui Ming Gemmy Cheung, Ching-Yu Cheng, Emily Y Chew, Michelle R. Hribar, Michael F. Chiang, Zhiyong Lu
VERA: Validation and Evaluation of Retrieval-Augmented Systems
Tianyu Ding, Adi Banerjee, Laurent Mombaerts, Yunhong Li, Tarik Borogovac, Juan Pablo De la Cruz Weinstein
A Disease-Specific Foundation Model Using Over 100K Fundus Images: Release and Validation for Abnormality and Multi-Disease Classification on Downstream Tasks
Boa Jang, Youngbin Ahn, Eun Kyung Choe, Chang Ki Yoon, Hyuk Jin Choi, Young-Gon Kim