Gradient Dynamics
Gradient dynamics research investigates how the optimization process of a neural network shapes its learned parameters and, ultimately, its performance. Current efforts focus on implicit regularization, the biases introduced by optimization algorithms such as gradient descent, across architectures ranging from single-neuron ReLU networks to deeper models such as matrix factorizations. These analyses draw on tools such as Wasserstein gradient flow and examine the role of over-parameterization. The broader goal is to explain the generalization ability and emergent properties of neural networks, which could lead to improved training strategies and a deeper theoretical understanding of deep learning.
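As a minimal illustrative sketch (not drawn from any specific paper listed here), the snippet below shows one form of implicit regularization mentioned above: plain gradient descent on an over-parameterized matrix factorization, started from a small initialization, tends to recover a low-rank solution even though the factorization could represent any matrix. All variable names and hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20

# Rank-1 ground truth with unit top singular value, partially observed.
u = rng.normal(size=(n, 1)); u /= np.linalg.norm(u)
v = rng.normal(size=(n, 1)); v /= np.linalg.norm(v)
target = u @ v.T
mask = rng.random((n, n)) < 0.3              # observe ~30% of entries

# Over-parameterized factorization W2 @ W1 (full inner dimension),
# initialized near zero; the small scale is what drives the implicit
# bias toward low-rank products in this setting.
W1 = 1e-2 * rng.normal(size=(n, n))
W2 = 1e-2 * rng.normal(size=(n, n))

lr = 0.1
for step in range(10000):
    W = W2 @ W1
    residual = mask * (W - target)           # gradient of the observed-entry loss w.r.t. W
    grad_W2 = residual @ W1.T
    grad_W1 = W2.T @ residual
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

W = W2 @ W1
print("observed-entry loss:", float(0.5 * np.sum((mask * (W - target)) ** 2)))
print("top singular values:", np.round(np.linalg.svd(W, compute_uv=False)[:5], 3))
# Expectation under these assumptions: the trailing singular values stay
# small, i.e. gradient descent implicitly prefers a (near) rank-1 product.
```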