Training Dynamics
Research on training dynamics in neural networks studies how model parameters evolve over the course of training, with the goal of understanding and optimizing the learning process for better performance and efficiency. Current work characterizes training dynamics across architectures such as transformers and convolutional networks, using techniques like contrastive learning and loss-landscape analysis to identify effective training protocols and to mitigate problems such as catastrophic forgetting and reward hacking in reinforcement learning from human feedback (RLHF). These studies inform more efficient training methods, improve model generalization, and ultimately advance the capabilities of artificial intelligence across diverse applications.
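As a minimal, hypothetical sketch of the kind of instrumentation such studies typically rely on, the PyTorch loop below records per-step loss, gradient norm, and parameter norm, which are among the raw signals a training-dynamics analysis starts from. The model, data, and hyperparameters are illustrative placeholders, not taken from any of the papers listed below.

# Minimal sketch (assumes PyTorch): instrument a training loop to log
# per-step statistics used in training-dynamics analyses.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# Synthetic regression data; any dataset would serve for this sketch.
x = torch.randn(256, 32)
y = torch.randn(256, 1)

history = []  # one record per optimization step
for step in range(100):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()

    # Global gradient and parameter norms: two common summary statistics
    # of where the optimizer sits on the loss landscape.
    grad_norm = torch.norm(
        torch.stack([p.grad.norm() for p in model.parameters()]))
    param_norm = torch.norm(
        torch.stack([p.detach().norm() for p in model.parameters()]))

    history.append({"step": step, "loss": loss.item(),
                    "grad_norm": grad_norm.item(),
                    "param_norm": param_norm.item()})
    opt.step()

# `history` can then be plotted to inspect, e.g., early-training phases,
# plateaus, or transitions between training regimes.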
Papers
Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning
Everlyn Asiko Chimoto, Jay Gala, Orevaoghene Ahia, Julia Kreutzer, Bruce A. Bassett, Sara Hooker
Understanding and Minimising Outlier Features in Neural Network Training
Bobby He, Lorenzo Noci, Daniele Paliotta, Imanol Schlag, Thomas Hofmann
Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu, Santiago Aranguri, Arthur Jacot
How Does Perfect Fitting Affect Representation Learning? On the Training Dynamics of Representations in Deep Neural Networks
Yuval Sharon, Yehuda Dar
On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability
Chenyu Zheng, Wei Huang, Rongzhen Wang, Guoqiang Wu, Jun Zhu, Chongxuan Li