Training Data Distribution
Training data distribution research studies how the characteristics of the data used to train machine learning models affect model performance and robustness, particularly with respect to out-of-distribution (OOD) generalization and federated learning. Current work explores techniques such as logit scaling and weight perturbations to improve OOD detection, and methods such as data augmentation and knowledge distillation to strengthen in-distribution generalization and cope with data heterogeneity in decentralized settings. These efforts are crucial for building reliable, adaptable AI systems whose performance holds up in real-world applications, where data distributions are often non-uniform and may shift over time.
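As a minimal illustration of the logit-scaling idea used in OOD detection, the NumPy sketch below scores inputs by their maximum softmax probability after dividing the logits by a temperature. The function name, temperature value, and decision threshold are illustrative assumptions, not the settings of any specific paper.

```python
import numpy as np

def ood_score(logits: np.ndarray, temperature: float = 10.0) -> np.ndarray:
    """Maximum softmax probability computed on temperature-scaled logits.

    Dividing logits by a temperature > 1 before the softmax (the core of
    ODIN-style logit scaling) widens the confidence gap between peaked
    in-distribution predictions and flat out-of-distribution ones; lower
    scores suggest an input is OOD. The default temperature is an
    illustrative choice, not a recommended value.
    """
    scaled = logits / temperature
    scaled -= scaled.max(axis=-1, keepdims=True)  # numerically stable softmax
    probs = np.exp(scaled)
    probs /= probs.sum(axis=-1, keepdims=True)
    return probs.max(axis=-1)

# Hypothetical logits from any trained classifier (batch of 2, 3 classes).
logits = np.array([[8.0, 1.0, 0.5],   # peaked: likely in-distribution
                   [2.1, 2.0, 1.9]])  # flat: possibly out-of-distribution
scores = ood_score(logits)
threshold = 0.45                      # assumed, tuned on held-out ID data
print(scores)                         # ~[0.51, 0.34]
print(scores < threshold)             # [False, True] -> second input flagged OOD
```

ODIN-style detectors typically pair this scaling with small input perturbations and much larger temperatures; the sketch keeps only the scaling step to show why flattening the softmax separates confident in-distribution predictions from diffuse OOD ones.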