Optimal Split
Optimal splitting of datasets is crucial for reliable model training and evaluation in machine learning: a well-chosen split minimizes bias and yields an accurate estimate of generalization performance. Current research focuses on efficient algorithms for optimal splits in a range of settings, including streaming data, federated learning (where methods such as ESFL optimize resource allocation across devices), and datasets containing near-duplicates. These advances improve model accuracy and robustness, particularly for challenges such as bias detection and out-of-distribution generalization, and ultimately lead to more reliable and trustworthy machine learning models across diverse applications.
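One concrete failure mode mentioned above is near-duplicates leaking across the train/test boundary, which inflates the estimated generalization score. Below is a minimal sketch, not taken from any of the cited papers, showing one standard way to guard against this with scikit-learn's group-aware splitter; the toy data and the `dup_group` labels identifying duplicate clusters are hypothetical and stand in for whatever duplicate-detection step a real pipeline would use.

```python
# Minimal sketch: group-aware train/test split so near-duplicates never
# straddle the split (assumes duplicate clusters are already identified).
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# Toy data: 100 samples, 30 hypothetical duplicate clusters.
X = rng.normal(size=(100, 5))
y = rng.integers(0, 2, size=100)
dup_group = rng.integers(0, 30, size=100)

# Every member of a duplicate cluster lands entirely in train or in test.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=dup_group))

# Sanity check: no duplicate group appears on both sides of the split.
assert set(dup_group[train_idx]).isdisjoint(dup_group[test_idx])
print(f"train={len(train_idx)}, test={len(test_idx)}")
```

The same idea extends to stratified or streaming settings by swapping in a different splitter; the key design choice is that the unit of splitting is the duplicate cluster rather than the individual sample.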