Data Splitting
Data splitting, the partitioning of a dataset into training, validation, and test subsets, is crucial for developing and evaluating machine learning models. Current research focuses on splitting strategies that avoid data leakage and bias, particularly for non-IID data, data with temporal dependencies (such as time series or video), and imbalanced class distributions. These techniques, often used alongside model architectures such as transformers and physics-informed neural networks, aim to improve model generalizability and reliability, and thereby support more robust and trustworthy machine learning applications across diverse fields.
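To make the three challenges named above concrete, here is a minimal sketch of one leakage-aware split for each case: a chronological split for temporal data, a group-wise split for non-IID data, and a stratified split for imbalanced classes. It uses only the Python standard library. The function names, the key-function parameters, and the default 20% test fraction are illustrative assumptions, not taken from any particular paper or library.

```python
import random
from collections import defaultdict


def _test_size(n, test_fraction):
    # Hypothetical helper: always hold out at least one item.
    return max(1, round(n * test_fraction))


def temporal_split(records, time_key, test_fraction=0.2):
    """Train on the earliest records and test on the latest ones,
    so no information from the future leaks into training."""
    ordered = sorted(records, key=time_key)
    cut = len(ordered) - _test_size(len(ordered), test_fraction)
    return ordered[:cut], ordered[cut:]


def group_split(records, group_key, test_fraction=0.2, seed=0):
    """Keep all records of one group (e.g. one patient or one video)
    on the same side of the split."""
    groups = sorted({group_key(r) for r in records})
    random.Random(seed).shuffle(groups)
    test_groups = set(groups[:_test_size(len(groups), test_fraction)])
    train = [r for r in records if group_key(r) not in test_groups]
    test = [r for r in records if group_key(r) in test_groups]
    return train, test


def stratified_split(records, label_key, test_fraction=0.2, seed=0):
    """Split each class separately so class ratios are roughly preserved.
    Note: a class with a single record ends up entirely in the test set."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for r in records:
        by_label[label_key(r)].append(r)
    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label][:]
        rng.shuffle(items)
        n_test = _test_size(len(items), test_fraction)
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test
```

Each function takes a list of records plus a key function, for example `group_split(rows, group_key=lambda r: r["patient_id"])`, where `patient_id` is a hypothetical field name. In practice these strategies are often combined (for instance, a group-wise split that is also stratified), and a validation subset can be carved out by applying the same function again to the training portion.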