Gradient Descent Dynamics

Research on gradient descent dynamics studies how neural network parameters evolve during training under gradient descent and its variants. Current investigations focus on the interplay between network architecture (including recurrent and feedforward networks), initialization strategies, and the resulting optimization trajectory, particularly phenomena such as the "edge of stability" and the effects of overparameterization. This work aims to improve training efficiency and generalization performance and to deepen the theoretical understanding of neural network learning, ultimately informing the design and application of machine learning models across domains.
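As a minimal illustration of the stability threshold that the "edge of stability" literature builds on, the sketch below (an assumed NumPy example; the matrix, step sizes, and function names are illustrative, not taken from any particular paper) runs plain gradient descent on a quadratic loss and shows how the iterates converge or diverge depending on whether the step size stays below 2 divided by the sharpness (the largest Hessian eigenvalue).

```python
import numpy as np

# Hedged sketch: gradient descent on f(w) = 0.5 * w^T H w.
# For a quadratic, iterates are stable only if eta < 2 / lambda_max(H),
# which is the classical threshold the edge-of-stability phenomenon hovers at.

def gradient_descent(H, w0, eta, steps):
    """Run w <- w - eta * grad f(w) = w - eta * H w and record the loss."""
    w = w0.copy()
    losses = []
    for _ in range(steps):
        losses.append(0.5 * w @ H @ w)
        w = w - eta * (H @ w)
    return np.array(losses)

H = np.diag([10.0, 1.0])           # sharpness = largest eigenvalue = 10
w0 = np.array([1.0, 1.0])

for eta in [0.05, 0.19, 0.21]:     # 2 / sharpness = 0.2 is the stability threshold
    losses = gradient_descent(H, w0, eta, steps=50)
    trend = "diverges" if losses[-1] > losses[0] else "converges"
    print(f"eta={eta:.2f}: final loss {losses[-1]:.3e} ({trend})")
```

In this toy setting the threshold is exact; in the deep-network experiments this research area studies, the sharpness itself changes along the trajectory, which is what makes the dynamics nontrivial.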

Papers