Model Overfitting
Model overfitting occurs when a machine learning model fits its training data too closely, capturing noise along with signal and losing the ability to generalize to unseen data; it remains a central challenge in the field. Current research focuses on understanding and mitigating overfitting in various contexts, including self-supervised learning, large language models, and continual learning, often employing techniques such as regularization, ensemble methods, and data augmentation tailored to specific architectures (e.g., transformers, convolutional neural networks). Addressing overfitting is crucial for improving the reliability and robustness of machine learning models across diverse applications, from image recognition and natural language processing to scientific discovery and engineering design. This ongoing work aims to develop more generalizable and trustworthy AI systems.
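As a concrete illustration of one mitigation mentioned above, the minimal sketch below contrasts an unregularized polynomial fit with an L2-regularized (ridge) fit on a toy dataset. Everything here is an illustrative assumption: the synthetic sine-wave data, the polynomial degree, the `fit_ridge` helper, and the lambda values are not drawn from any of the listed papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression task: noisy samples from a sine wave (assumed data).
def make_data(n):
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

x_train, y_train = make_data(15)   # small training set invites overfitting
x_test, y_test = make_data(200)

def poly_features(x, degree):
    # Vandermonde-style design matrix: [1, x, x^2, ..., x^degree].
    return np.vander(x, degree + 1, increasing=True)

def fit_ridge(X, y, lam):
    # Solve min_w ||Xw - y||^2 + lam * ||w||^2 via an augmented
    # least-squares system; lam = 0 recovers ordinary least squares.
    d = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(d)])
    y_aug = np.concatenate([y, np.zeros(d)])
    w, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return w

degree = 12  # deliberately over-parameterized for 15 training points
X_train = poly_features(x_train, degree)
X_test = poly_features(x_test, degree)

for lam in [0.0, 1e-3]:
    w = fit_ridge(X_train, y_train, lam)
    train_mse = np.mean((X_train @ w - y_train) ** 2)
    test_mse = np.mean((X_test @ w - y_test) ** 2)
    print(f"lambda={lam:g}: train MSE={train_mse:.3f}, test MSE={test_mse:.3f}")
```

With lambda = 0 the over-parameterized model drives training error toward zero while test error stays high, the signature of overfitting; the small L2 penalty trades a slightly worse training fit for markedly better generalization.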
Papers
Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models
Jie Chen, Yupeng Zhang, Bingning Wang, Wayne Xin Zhao, Ji-Rong Wen, Weipeng Chen
AEM: Attention Entropy Maximization for Multiple Instance Learning based Whole Slide Image Classification
Yunlong Zhang, Zhongyi Shui, Yunxuan Sun, Honglin Li, Jingxiong Li, Chenglu Zhu, Lin Yang