Gradient Dynamics

Gradient dynamics research investigates how the optimization process of neural networks shapes their learned parameters and, ultimately, their performance. Current efforts focus on understanding implicit regularization, the biases that optimization algorithms such as gradient descent introduce, across architectures ranging from single-neuron ReLU networks to deeper matrix-factorization models; training behavior is analyzed through techniques such as Wasserstein gradient flow, with particular attention to the role of over-parameterization. These studies aim to explain the generalization ability and emergent properties of neural networks, potentially leading to improved training strategies and a deeper theoretical understanding of deep learning.
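
As a concrete illustration of the kind of implicit bias this line of work studies, the sketch below (a standalone toy example, not code from any listed paper) runs plain gradient descent on an over-parameterized linear regression problem. Started from zero, the iterates remain in the row space of the data and converge to the minimum-L2-norm interpolating solution, even though nothing in the loss asks for small norm; richer settings such as single-neuron ReLU networks or deep matrix factorization exhibit analogous, architecture-dependent biases.

```python
import numpy as np

# Toy over-parameterized least-squares problem: fewer samples than parameters,
# so infinitely many weight vectors interpolate the data exactly.
rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Plain (full-batch) gradient descent on the squared loss, initialized at zero.
# The zero initialization is what pins down the implicit bias here.
w = np.zeros(d)
lr = 1e-2
for _ in range(20_000):
    grad = X.T @ (X @ w - y) / n
    w -= lr * grad

# The minimum-norm interpolating solution (the pseudo-inverse solution).
w_min_norm = X.T @ np.linalg.solve(X @ X.T, y)

print("training residual:", np.linalg.norm(X @ w - y))
print("distance to min-norm solution:", np.linalg.norm(w - w_min_norm))
```

Both printed quantities should be close to zero: gradient descent not only fits the data but, among all interpolating solutions, selects the one of smallest Euclidean norm, which is the simplest instance of the implicit regularization effect discussed above.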

Papers