Model Overfitting
Model overfitting, in which a machine learning model fits the training data so closely that it fails to generalize to unseen data, remains a central challenge in the field. Current research focuses on understanding and mitigating overfitting in various contexts, including self-supervised learning, large language models, and continual learning, often employing techniques such as regularization, ensemble methods, and data augmentation tailored to specific architectures (e.g., transformers, convolutional neural networks). Addressing overfitting is crucial for improving the reliability and robustness of machine learning models across diverse applications, from image recognition and natural language processing to scientific discovery and engineering design. This ongoing work aims to develop more generalizable and trustworthy AI systems.
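As a minimal illustration (not drawn from the papers listed below), the sketch below shows the usual symptom of overfitting, a large gap between training and held-out error, and how one of the mitigation techniques mentioned above (L2 regularization) narrows it on a toy regression task. The polynomial degree, noise level, and `alpha` value are illustrative assumptions.

```python
# Toy overfitting demo: high-degree polynomial regression with and without
# L2 regularization (ridge). Hyperparameters here are illustrative only.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(60, 1))
y = np.sin(3 * X[:, 0]) + 0.3 * rng.normal(size=60)  # noisy 1-D target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

for name, reg in [("unregularized", LinearRegression()),
                  ("ridge (alpha=1.0)", Ridge(alpha=1.0))]:
    model = make_pipeline(PolynomialFeatures(degree=15), reg)
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    # A large train/test gap signals overfitting; the regularized model
    # typically shows a smaller gap at the cost of slightly higher train error.
    print(f"{name}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```

The same diagnostic (tracking the train/validation gap) applies regardless of architecture; only the mitigation, e.g., weight decay, dropout, data augmentation, or ensembling, changes with the model class.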
Papers
Initialization Matters: On the Benign Overfitting of Two-Layer ReLU CNN with Fully Trainable Layers
Shuning Shang, Xuran Meng, Yuan Cao, Difan Zou
Provable Tempered Overfitting of Minimal Nets and Typical Nets
Itamar Harel, William M. Hoza, Gal Vardi, Itay Evron, Nathan Srebro, Daniel Soudry
Adjusted Overfitting Regression
Dylan Wilson