Gradient Flow
Gradient flow, the continuous-time limit of gradient descent, is a powerful tool for analyzing the training dynamics of machine learning models, particularly deep neural networks. Current research studies how gradient flow behaves in different architectures (e.g., ResNets, transformers) and how it relates to optimization algorithms such as stochastic gradient descent and mirror descent, investigating phenomena like oversmoothing and the effects of different regularization schemes. These analyses help explain implicit biases, convergence rates, and the effectiveness of various training techniques, informing better model design and training strategies and improving the performance and stability of machine learning algorithms across diverse applications.
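As a minimal illustration of the "continuous-time limit" relationship (not drawn from the listed papers), the sketch below runs gradient descent on a simple quadratic loss for a fixed training time T with progressively smaller step sizes; the iterates converge toward the gradient-flow ODE dθ/dt = -∇L(θ) evaluated at time T. The loss, matrix A, and helper names are illustrative assumptions.

```python
import numpy as np

# Illustrative quadratic loss L(theta) = 0.5 * theta^T A theta (assumed for the sketch).
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])

def grad(theta):
    # Gradient of the quadratic loss: grad L(theta) = A theta.
    return A @ theta

theta0 = np.array([1.0, -2.0])
T = 2.0  # total "training time" along the flow

def gradient_descent_for_time(step):
    """Gradient descent with a given step size, run for total time T.

    Each update theta <- theta - step * grad(theta) is a forward-Euler step
    of the gradient-flow ODE d(theta)/dt = -grad L(theta).
    """
    theta = theta0.copy()
    for _ in range(int(T / step)):
        theta = theta - step * grad(theta)
    return theta

# As the step size shrinks, the discrete iterates approach the
# continuous-time gradient flow at time T (the continuous-time limit).
for step in (0.5, 0.1, 0.01, 0.001):
    print(f"step={step:>6}:", gradient_descent_for_time(step))
```

Printing the final iterates shows them stabilizing as the step size decreases, which is the sense in which gradient flow is the small-step limit of gradient descent.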
Papers
MonoFlow: Rethinking Divergence GANs via the Perspective of Wasserstein Gradient Flows
Mingxuan Yi, Zhanxing Zhu, Song Liu
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
Francois Caron, Fadhel Ayed, Paul Jung, Hoil Lee, Juho Lee, Hongseok Yang
Lipschitz-regularized gradient flows and generative particle algorithms for high-dimensional scarce data
Hyemin Gu, Panagiota Birmpa, Yannis Pantazis, Luc Rey-Bellet, Markos A. Katsoulakis
Symmetries, flat minima, and the conserved quantities of gradient flow
Bo Zhao, Iordan Ganev, Robin Walters, Rose Yu, Nima Dehmamy