Singular Learning

Singular learning theory (SLT) studies statistical models whose parameter spaces contain singularities: points where the Fisher information matrix is degenerate, so that distinct parameters can represent the same function and the loss is not locally quadratic. Deep neural networks and mixture models are typical examples. Current research develops tools such as the local learning coefficient (LLC) to quantify effective model complexity and to analyze learning dynamics near singularities, particularly in transformer and deep linear network architectures. The work aims to explain and address the slow convergence and suboptimal solutions associated with singularities, with the goal of improving training efficiency and generalization and of putting deep learning on firmer theoretical ground.
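
To make the idea of a learning coefficient concrete, here is a minimal sketch based on the standard volume-scaling characterization: the volume of the parameter set with loss below eps shrinks like eps^lambda as eps goes to 0 (up to logarithmic factors), and lambda is the learning coefficient. The two toy losses (w^2 and w^4), the function name, and the Monte Carlo setup are illustrative assumptions and do not come from any of the listed papers. This is not the sampling-based LLC estimator used on real networks.

```python
import numpy as np

def estimate_learning_coefficient(loss, eps_values, n_samples=1_000_000, seed=0):
    """Estimate lambda from V(eps) ~ c * eps**lambda for a 1-D toy loss.

    Hypothetical helper for illustration: V(eps) is approximated by the
    fraction of uniform samples on [-1, 1] whose loss falls below eps.
    """
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, size=n_samples)
    k = loss(w)
    # Monte Carlo estimate of the sub-level-set volume at each threshold.
    volumes = np.array([(k < eps).mean() for eps in eps_values])
    # The slope of log V against log eps estimates lambda.
    slope, _intercept = np.polyfit(np.log(eps_values), np.log(volumes), 1)
    return slope

eps_values = np.logspace(-4, -2, 10)

# Regular (quadratic) minimum: expect lambda close to d/2 = 1/2 for d = 1.
lam_regular = estimate_learning_coefficient(lambda w: w**2, eps_values)

# Degenerate (quartic) minimum: expect lambda close to 1/4, below d/2.
lam_singular = estimate_learning_coefficient(lambda w: w**4, eps_values)

print(f"quadratic loss: lambda ~ {lam_regular:.3f}")
print(f"quartic loss:   lambda ~ {lam_singular:.3f}")
```

The quartic loss has a flatter minimum, so more of the parameter space sits near zero loss and the estimated coefficient is smaller. In this toy setting, that is the sense in which a lower learning coefficient indicates a lower effective complexity.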

Papers