Model Overfitting
Model overfitting occurs when a machine learning model fits its training data so closely, noise included, that it generalizes poorly to unseen data. It remains a central challenge in the field. Current research focuses on understanding and mitigating overfitting in settings such as self-supervised learning, large language models, and continual learning, often using regularization, ensemble methods, and data augmentation tailored to specific architectures (e.g., transformers, convolutional neural networks). Addressing overfitting is crucial for the reliability and robustness of machine learning models across diverse applications, from image recognition and natural language processing to scientific discovery and engineering design, and this ongoing work aims to produce more generalizable and trustworthy AI systems.
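As a rough illustration of the definition above, the following sketch fits a high-degree polynomial to a handful of noisy points, once with plain least squares and once with an L2 (ridge) penalty, and reports training and held-out error for each. Everything in it is an assumption made for illustration: the sine target, the sample sizes, the polynomial degree, the penalty strength `lam`, and the helper names (`poly_features`, `fit_ridge`, `run_demo`) do not come from any of the papers listed here.

```python
import numpy as np


def poly_features(x, degree):
    # Columns are x**0, x**1, ..., x**degree.
    return np.vander(x, degree + 1, increasing=True)


def fit_ridge(X, y, lam):
    # L2-regularized least squares: solve (X^T X + lam * I) w = X^T y.
    # For simplicity the penalty also applies to the intercept column.
    # lam == 0 is ordinary least squares, solved with lstsq for stability.
    if lam == 0.0:
        return np.linalg.lstsq(X, y, rcond=None)[0]
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)


def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))


def run_demo(seed=0, n_train=10, n_val=200, degree=9, lam=1e-3, noise=0.3):
    # Hypothetical setup: noisy samples of sin(pi * x) on [-1, 1].
    rng = np.random.default_rng(seed)
    x_tr = rng.uniform(-1.0, 1.0, n_train)
    x_va = rng.uniform(-1.0, 1.0, n_val)
    y_tr = np.sin(np.pi * x_tr) + noise * rng.normal(size=n_train)
    y_va = np.sin(np.pi * x_va) + noise * rng.normal(size=n_val)

    X_tr = poly_features(x_tr, degree)
    X_va = poly_features(x_va, degree)

    results = {}
    for name, penalty in [("unregularized", 0.0), ("ridge", lam)]:
        w = fit_ridge(X_tr, y_tr, penalty)
        # Store (training MSE, held-out MSE) for each model.
        results[name] = (mse(X_tr, y_tr, w), mse(X_va, y_va, w))
    return results


if __name__ == "__main__":
    for name, (train_err, val_err) in run_demo().items():
        print(f"{name:>14}: train MSE = {train_err:.4g}, held-out MSE = {val_err:.4g}")
```

With as many polynomial coefficients as training points, the unregularized fit can pass through every training point, so its training error is never higher than the ridge fit's. The gap between training and held-out error is the quantity to watch. How much the penalty helps depends on the seed and on `lam`, so the printed numbers should be read as one draw, not as a benchmark.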
Papers
Inversion dynamics of class manifolds in deep learning reveals tradeoffs underlying generalisation
Simone Ciceri, Lorenzo Cassani, Matteo Osella, Pietro Rotondo, Filippo Valle, Marco Gherardi
SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model
Gengwei Zhang, Liyuan Wang, Guoliang Kang, Ling Chen, Yunchao Wei