Gradient Flow
Gradient flow, the continuous-time limit of gradient descent, is a powerful tool for analyzing the training dynamics of machine learning models, particularly deep neural networks. Current research focuses on characterizing its behavior across architectures (e.g., ResNets, transformers) and optimization algorithms (e.g., stochastic gradient descent, mirror descent), and on phenomena such as oversmoothing and the effects of different regularization schemes. This analysis helps explain implicit biases, convergence rates, and the effectiveness of different training techniques, ultimately informing better model design and training strategies. The insights gained are crucial for improving the performance and stability of machine learning algorithms across diverse applications.
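To make the "continuous-time limit" concrete: gradient descent with step size eta is the explicit Euler discretization of the gradient-flow ODE dθ/dt = -∇L(θ), and as eta → 0 (with eta × steps held fixed) the discrete iterates track the flow. The sketch below illustrates this on a toy separable quadratic loss, where the flow has a closed-form solution; the loss, variable names, and step sizes are illustrative choices, not taken from the listed papers.

```python
import numpy as np

# Toy separable quadratic loss L(theta) = 0.5 * sum_i a_i * theta_i^2,
# whose gradient is simply a * theta (elementwise).
a = np.array([3.0, 1.0])          # curvature along each coordinate
theta0 = np.array([1.0, 1.0])     # initial parameters

def grad(theta):
    return a * theta

def gradient_descent(theta, eta, steps):
    # Explicit Euler discretization of the flow: theta <- theta - eta * grad(theta).
    for _ in range(steps):
        theta = theta - eta * grad(theta)
    return theta

def gradient_flow(theta, t):
    # Exact solution of d theta/dt = -grad(theta) for this separable quadratic loss:
    # each coordinate decays as exp(-a_i * t).
    return np.exp(-a * t) * theta

# As eta -> 0 with eta * steps = t held fixed, gradient descent tracks the flow.
t = 1.0
for eta in (0.5, 0.1, 0.01):
    steps = int(round(t / eta))
    gd = gradient_descent(theta0.copy(), eta, steps)
    gf = gradient_flow(theta0, t)
    print(f"eta={eta:5.2f}  ||GD - flow|| = {np.linalg.norm(gd - gf):.4f}")
```

Running this prints a gap between the gradient-descent iterate and the exact flow solution that shrinks as the step size decreases, which is the sense in which gradient flow approximates (and is used to analyze) gradient-descent training dynamics.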
Papers
Differentially Private Gradient Flow based on the Sliced Wasserstein Distance for Non-Parametric Generative Modeling
Ilana Sebag, Muni Sreenivas PYDI, Jean-Yves Franceschi, Alain Rakotomamonjy, Mike Gartrell, Jamal Atif, Alexandre Allauzen
On the Dynamics Under the Unhinged Loss and Beyond
Xiong Zhou, Xianming Liu, Hanzhang Wang, Deming Zhai, Junjun Jiang, Xiangyang Ji