Paper ID: 2412.18624
How to explain grokking
S.V. Kozyrev
Grokking (delayed generalization) in learning is explained by modeling it with stochastic gradient Langevin dynamics (Brownian motion) and applying ideas from thermodynamics.
Submitted: Dec 17, 2024
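
The abstract names stochastic gradient Langevin dynamics (SGLD) without stating the update rule. As a rough illustration only, the sketch below uses the commonly cited form theta <- theta - eta * grad L(theta) + sqrt(2 * eta * T) * xi, with xi drawn from a standard normal distribution. The function names, the temperature parameter T, and the toy quadratic loss are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def sgld_step(theta, grad, step_size, temperature, rng):
    """One Langevin update: a gradient step plus scaled Gaussian noise."""
    noise = rng.gauss(0.0, 1.0)
    return theta - step_size * grad(theta) + math.sqrt(2.0 * step_size * temperature) * noise


def run_sgld(grad, theta0, step_size, temperature, n_steps, seed=0):
    """Iterate sgld_step and return the full trajectory, including theta0."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    theta = theta0
    trajectory = [theta]
    for _ in range(n_steps):
        theta = sgld_step(theta, grad, step_size, temperature, rng)
        trajectory.append(theta)
    return trajectory


def quadratic_grad(theta):
    """Gradient of the toy loss L(theta) = theta**2 / 2 (hypothetical example)."""
    return theta


if __name__ == "__main__":
    traj = run_sgld(quadratic_grad, theta0=3.0, step_size=0.01,
                    temperature=0.1, n_steps=50000)
    tail = traj[5000:]  # discard the initial transient
    mean = sum(tail) / len(tail)
    var = sum((x - mean) ** 2 for x in tail) / len(tail)
    print(f"empirical variance of theta: {var:.3f}")
```

For this toy loss the long-run distribution of theta is approximately Gaussian with variance close to the temperature, which is the sense in which the noise level acts like a thermodynamic temperature. Setting temperature to 0 recovers plain gradient descent.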