Paper ID: 2412.18624

How to explain grokking

S.V. Kozyrev

We explain grokking (delayed generalization) in learning by modeling it with stochastic gradient Langevin dynamics (Brownian motion) and applying ideas from thermodynamics.
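
For readers unfamiliar with Langevin dynamics, the following is a minimal sketch of the standard discretized (overdamped) Langevin update, in which a gradient step is perturbed by Gaussian noise scaled by a temperature. It is not taken from the paper: the toy quadratic loss, the function names `langevin_step` and `sample`, and all parameter values are assumptions chosen for illustration, and the exact gradient is used here where stochastic gradient Langevin dynamics would use a minibatch estimate.

```python
import numpy as np


def langevin_step(theta, grad, step, temperature, rng):
    """One discretized Langevin update: gradient descent plus Gaussian noise.

    theta_{k+1} = theta_k - step * grad(theta_k) + sqrt(2 * step * temperature) * xi,
    with xi drawn from a standard normal distribution.
    """
    noise = rng.standard_normal(theta.shape)
    return theta - step * grad(theta) + np.sqrt(2.0 * step * temperature) * noise


def sample(grad, theta0, step, temperature, n_steps, seed=0):
    """Run the chain for n_steps and return every visited parameter value."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    trace = np.empty((n_steps,) + theta.shape)
    for k in range(n_steps):
        theta = langevin_step(theta, grad, step, temperature, rng)
        trace[k] = theta
    return trace


# Hypothetical toy loss L(theta) = 0.5 * theta**2, so grad L(theta) = theta.
temperature = 0.5
trace = sample(lambda th: th, theta0=np.array([3.0]), step=0.01,
               temperature=temperature, n_steps=200_000)

# After discarding burn-in, the samples should be spread roughly like the
# Gibbs distribution exp(-L / temperature), whose variance is `temperature`.
empirical_var = trace[20_000:].var()
print(f"empirical variance: {empirical_var:.3f}, temperature: {temperature}")
```

For this quadratic loss the chain settles into an approximately Gaussian distribution whose variance is close to the temperature, which is the link between noisy gradient dynamics and thermodynamic (Gibbs) equilibrium that this sketch is meant to illustrate.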

Submitted: Dec 17, 2024