Lazy Training

Lazy training describes a phenomenon in neural network training where the model's parameters remain close to their initial values throughout optimization, so that the trained network is well approximated by its first-order Taylor expansion around initialization and effectively behaves like a linear model in its parameters despite the network's nonlinearity. Current research focuses on understanding the conditions under which lazy training occurs, its relationship to generalization performance (including benign overfitting and adversarial robustness), and its implications for different architectures and training algorithms, such as gradient descent and biologically plausible alternatives. This research is significant because it provides insight into the dynamics of deep learning, potentially leading to more efficient and robust training methods, and it offers theoretical explanations for observed empirical phenomena such as grokking.
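
As a concrete illustration of this linearized view, the sketch below (not taken from any paper listed here; the width, output scale `ALPHA`, learning rate, and step count are all illustrative assumptions) trains a small MLP in JAX and then checks the two usual signatures of lazy training: the parameters move very little relative to their initial norm, and the trained network's predictions stay close to its first-order Taylor expansion around the initial parameters.

```python
import jax
import jax.numpy as jnp

ALPHA = 10.0  # output scaling; a large ALPHA pushes training toward the lazy regime

def init_params(key, widths=(1, 256, 256, 1)):
    """Random MLP parameters; large width also encourages lazy behaviour."""
    params = []
    for d_in, d_out in zip(widths[:-1], widths[1:]):
        key, sub = jax.random.split(key)
        w = jax.random.normal(sub, (d_in, d_out)) / jnp.sqrt(d_in)
        params.append((w, jnp.zeros(d_out)))
    return params

def net(params, x):
    h = x
    for w, b in params[:-1]:
        h = jnp.tanh(h @ w + b)
    w, b = params[-1]
    return h @ w + b

key = jax.random.PRNGKey(0)
x = jnp.linspace(-1.0, 1.0, 64).reshape(-1, 1)
y = jnp.sin(3.0 * x)

params0 = init_params(key)

def model(params):
    # Centered, scaled model: f(theta) = ALPHA * (net(theta) - net(theta_0)),
    # so the output at initialization is exactly zero.
    return ALPHA * (net(params, x) - net(params0, x))

def loss(params):
    return jnp.mean((model(params) - y) ** 2)

# Plain full-batch gradient descent.
grad_fn = jax.jit(jax.grad(loss))
params = params0
lr = 1e-3
for _ in range(500):
    grads = grad_fn(params)
    params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Linearization of the model around params0 (first-order Taylor expansion):
# f_lin(theta) = f(theta_0) + J(theta_0) (theta - theta_0), computed with a JVP.
delta = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params0)
f0, jvp_out = jax.jvp(model, (params0,), (delta,))
f_lin = f0 + jvp_out

def global_norm(tree):
    return jnp.sqrt(sum(jnp.sum(leaf ** 2) for leaf in jax.tree_util.tree_leaves(tree)))

# In the lazy regime both quantities are small: the parameters barely move and the
# trained network is well approximated by its linearization at initialization.
print("relative parameter movement:", float(global_norm(delta) / global_norm(params0)))
print("mean |f(theta) - f_lin(theta)|:", float(jnp.mean(jnp.abs(model(params) - f_lin))))
print("training loss:", float(loss(params)))
```

The centered, scaled form `ALPHA * (net(theta) - net(theta_0))` follows the common setup in the lazy-training literature: as the output scale grows, the parameter change needed to fit the targets shrinks, so the dynamics stay in the near-linear neighborhood of the initialization.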

Papers