Gradient Descent Dynamics
Research on gradient descent dynamics studies how the parameters of a neural network evolve over the course of training under gradient descent and its variants. Current work focuses on the interplay between network architecture (including recurrent and feedforward networks), initialization strategies, and the resulting optimization trajectory, with particular attention to phenomena such as the "edge of stability" and the effects of overparameterization. The goal is to improve training efficiency and generalization performance and to deepen the theoretical understanding of neural network learning, ultimately informing the design and application of machine learning models across domains.
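The "edge of stability" has a simple classical counterpart that a short sketch can make concrete: on a quadratic loss, gradient descent converges only when the step size stays below 2/λ_max of the Hessian (the "sharpness"). The example below is a minimal, illustrative sketch of that threshold, not code from any of the works this page covers; the matrix H, the step sizes, and the helper name run_gd are assumptions chosen for demonstration.

```python
import numpy as np

# Minimal illustrative sketch (assumed setup, not from a specific paper):
# plain gradient descent on a quadratic loss L(w) = 0.5 * w @ H @ w.
# For this loss, iterates are stable iff lr < 2 / lambda_max(H); the
# "edge of stability" literature studies training near that boundary.

def run_gd(H, w0, lr, steps):
    """Iterate w <- w - lr * grad L(w), where grad L(w) = H @ w."""
    w = w0.copy()
    for _ in range(steps):
        w = w - lr * (H @ w)
    return 0.5 * w @ H @ w  # final loss

H = np.diag([1.0, 10.0])                       # sharpness lambda_max(H) = 10
w0 = np.array([1.0, 1.0])
threshold = 2.0 / np.linalg.eigvalsh(H)[-1]    # stability threshold = 0.2

for lr in (0.05, 0.19, 0.21):  # well below, just below, just above threshold
    print(f"lr={lr:.2f} (threshold {threshold:.2f}): "
          f"final loss {run_gd(H, w0, lr, steps=50):.3e}")
```

Running this shows rapid convergence at lr=0.05, slower oscillatory convergence at lr=0.19, and divergence at lr=0.21. In deep networks the empirical picture is subtler: full-batch gradient descent has been observed to drive the sharpness up toward roughly 2/lr and then hover there rather than diverging, which is what motivates the edge-of-stability analyses referenced above.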