Learning Curve
Learning curves plot a machine learning model's performance against training set size or training time, and they are a central tool for understanding model behavior and optimizing training. Current research focuses on improving learning curve estimation, particularly for deep learning models such as convolutional neural networks (CNNs) and language models, in order to predict final performance, reduce training time, and identify good hyperparameters. This work also addresses overoptimism in published results and the unexpected phenomenon of "grokking," where performance improves suddenly after a long period of stagnation. The overall aim is to make machine learning more efficient and reliable across applications.
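As a rough illustration of learning curve estimation, the sketch below fits a power law, error ≈ a · n^(−b), to a handful of (training set size, validation error) points and extrapolates to a larger training set. The power-law form is a common modeling assumption rather than the method of any particular paper listed here, the function names `fit_power_law` and `predict_error` are hypothetical, and the data points are synthetic and noise-free.

```python
import numpy as np

def fit_power_law(sizes, errors):
    """Fit errors ~= a * sizes**(-b) by least squares in log-log space.

    Assumes all sizes and errors are strictly positive and that there is
    no irreducible error floor (a simplification for illustration).
    """
    log_n = np.log(np.asarray(sizes, dtype=float))
    log_e = np.log(np.asarray(errors, dtype=float))
    # A power law is a straight line in log-log space:
    # log(error) = log(a) - b * log(n)
    slope, intercept = np.polyfit(log_n, log_e, 1)
    return np.exp(intercept), -slope

def predict_error(a, b, n):
    """Extrapolate the fitted curve to training set size n."""
    return a * np.asarray(n, dtype=float) ** (-b)

# Synthetic measurements generated from a known curve (a=2.0, b=0.5);
# these are not results from a real experiment.
sizes = np.array([100.0, 200.0, 400.0, 800.0, 1600.0])
errors = 2.0 * sizes ** (-0.5)

a, b = fit_power_law(sizes, errors)
predicted = predict_error(a, b, 6400)
print(f"a={a:.3f}, b={b:.3f}, predicted error at n=6400: {predicted:.4f}")
```

With real measurements the points are noisy and the error usually levels off at a nonzero floor, so a fit of this kind is best treated as a coarse estimate, ideally checked against a held-out larger training run.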