Diagonal Linear Network
Diagonal linear networks (DLNs) are simplified neural network models used to study implicit regularization—the phenomenon where optimization algorithms implicitly prefer certain solutions, such as sparse ones, even without explicit regularization terms. A DLN reparametrizes a linear predictor's weight vector as an elementwise product of trainable factors (for example, w = u ⊙ v), which keeps the model linear in function space while making the gradient dynamics nonlinear and analytically tractable. Current research focuses on understanding the dynamics of gradient descent and its variants (including momentum) on DLNs, analyzing their convergence properties, and characterizing the resulting solutions in terms of sparsity and other structural properties. This research contributes to a deeper understanding of how neural networks generalize and learn, potentially informing the design of more efficient and robust algorithms for various machine learning tasks.
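The sparsity bias described above can be seen in a small experiment. The sketch below is illustrative, not taken from any particular paper: it uses the common w = u² − v² diagonal parametrization with small initialization, and all sizes and hyperparameters (n, d, the step size, the init scale α) are assumptions chosen so the run converges quickly. Plain gradient descent on an underdetermined least-squares problem then tends to recover a sparse interpolating solution, even though no ℓ1 penalty appears anywhere in the loss.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 30                      # fewer equations than unknowns
X = rng.standard_normal((n, d))
w_star = np.zeros(d)
w_star[:3] = [2.0, -1.5, 1.0]      # sparse ground-truth weights
y = X @ w_star                     # noiseless targets

# Diagonal linear network: w = u*u - v*v, small symmetric init.
# Small alpha pushes gradient descent toward the minimum-l1 interpolator.
alpha, lr, steps = 1e-3, 1e-2, 50_000
u = np.full(d, alpha)
v = np.full(d, alpha)

for _ in range(steps):
    w = u * u - v * v
    g = X.T @ (X @ w - y) / n      # gradient of 0.5*mean squared error wrt w
    u -= lr * 2.0 * u * g          # chain rule: dw/du = 2u
    v += lr * 2.0 * v * g          # chain rule: dw/dv = -2v

w = u * u - v * v
print("residual:", np.linalg.norm(X @ w - y))
print("largest entries at:", np.sort(np.argsort(np.abs(w))[-3:]))
```

With this small initialization the recovered w concentrates its mass on the three support coordinates of w_star; increasing α weakens the sparsity bias and the solution drifts toward the minimum-ℓ2 interpolator instead.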