Early Stage Convergence

Research on early stage convergence in machine learning studies the initial phases of training, with the aim of speeding up convergence and improving generalization. Current work approaches the topic through optimization algorithms (e.g., Adam, SGD, FedProx), model architectures (e.g., transformers, diffusion models), and specific problem domains (e.g., federated learning, collaborative filtering). These studies draw on dynamical systems theory and optimal transport to establish convergence guarantees and bounds, contributing to more efficient and robust machine learning systems across diverse applications.

Papers

January 26, 2023

On the Convergence of No-Regret Learning Dynamics in Time-Varying Games
Ioannis Anagnostides, Ioannis Panageas, Gabriele Farina, Tuomas Sandholm
Early Stage Convergence Dynamic Regret General Sum Game Regret Dynamic Dependent Regret

January 25, 2023

Graph Neural Tangent Kernel: Convergence on Large Graphs
Sanjukta Krishnagopal, Luana Ruiz
Graph Neural Network Early Stage Convergence Graph Data Graphon Estimation Graph Neural Tangent Kernel

January 23, 2023

January 19, 2023

Convergence beyond the over-parameterized regime using Rayleigh quotients
David A. R. Robin, Kevin Scaman, Marc Lelarge
Loss Function Early Stage Convergence Gradient Flow Loss Minimization Parameterized Regime Parametric Learning Rayleigh Regression

January 17, 2023

Convergence of First-Order Algorithms for Meta-Learning with Moreau Envelopes
Konstantin Mishchenko, Slavomír Hanzely, Peter Richtárik
Early Stage Convergence Hessian Matrix Convergence Guarantee First Order Algorithm Agnostic Meta Learning Gradient Step Moreau Envelope

January 5, 2023

Beyond spectral gap (extended): The role of the topology in decentralized learning
Thijs Vogels, Hadrien Hendrikx, Martin Jaggi
Integral Role Early Stage Convergence Sparse Graph Topology Problem Parallel Optimization Spectral Gap

January 4, 2023

On the Convergence of Stochastic Gradient Descent in Low-precision Number Formats
Matteo Cacciola, Antonio Frangioni, Masoud Asgharian, Alireza Ghaffari, Vahid Partovi Nia
Stochastic Gradient Descent Early Stage Convergence Low Precision

January 1, 2023

Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK
Hongru Yang, Ziyu Jiang, Ruizhe Zhang, Yingbin Liang, Zhangyang Wang
Strong Generalization Early Stage Convergence Neural Tangent Kernel ReLU Network Sparse Network Wide Neural Network Sparse Activation Relative Stance Bias

December 28, 2022

On the Convergence of Discounted Policy Gradient Methods
Chris Nota
Reinforcement Learning Policy Gradient Early Stage Convergence Estimation Bias Gradient Ascent

December 22, 2022

Improving Convergence for Quantum Variational Classifiers using Weight Re-Mapping
Michael Kölle, Alessandro Giovagnoli, Jonas Stein, Maximilian Balthasar Mansky, Julian Hager, Claudia Linnhoff-Popien
Quantum Machine Learning Early Stage Convergence Quantum Computing Variational Quantum Circuit Variational Learning Variational Quantum Classifier Weight Re Mapping

December 4, 2022

Convergence of ease-controlled Random Reshuffling gradient Algorithms under Lipschitz smoothness
Ruggiero Seccia, Corrado Coppola, Giampaolo Liuzzi, Laura Palagi
Early Stage Convergence Natural Gradient Line Search Random Reshuffling Full Batch Gradient Descent Incremental Gradient

November 21, 2022

EM's Convergence in Gaussian Latent Tree Models
Yuval Dagan, Constantinos Daskalakis, Anthimos Vardis Kandiros
Early Stage Convergence Expectation Maximization Log Likelihood Latent Tree

November 15, 2022

The rate of convergence of Bregman proximal methods: Local geometry vs. regularity vs. sharpness
Waïss Azizian, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos
Early Stage Convergence Proximal Gradient Variational Inequality Last Iterate Convergence Node Feature Sharpness Bregman Information Local Geometry Linear Convergence Rate

November 7, 2022

Lower Bounds for the Convergence of Tensor Power Iteration on Random Overcomplete Models
Yuchen Wu, Kangjie Zhou
Early Stage Convergence Lower Bound Tensor Decomposition Tensor Based Tensor Power Random Tensor Power Iteration

November 3, 2022

Geometry and convergence of natural policy gradient methods
Johannes Müller, Guido Montúfar
Markov Decision Process Early Stage Convergence Geometric Analysis Natural Policy Gradient Local Convergence Policy Parametrization

November 2, 2022

Convergence of the Inexact Langevin Algorithm and Score-based Generative Models in KL Divergence
Kaylee Yingxi Yang, Andre Wibisono
Early Stage Convergence Langevin Dynamic Score Based Generative KL Divergence Score Estimation Provable Convergence

November 1, 2022

Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems
Michael Giegrich, Christoph Reisinger, Yufei Zhang
Policy Gradient Early Stage Convergence Gaussian Policy

October 28, 2022

Nonuniqueness and Convergence to Equivalent Solutions in Observer-based Inverse Reinforcement Learning
Jared Town, Zachary Morrison, Rushikesh Kamalapurkar
Early Stage Convergence Inverse Reinforcement Learning Multiple Solution

October 18, 2022

STay-ON-the-Ridge: Guaranteed Convergence to Local Minimax Equilibrium in Nonconvex-Nonconcave Games
Constantinos Daskalakis, Noah Golowich, Stratis Skoulakis, Manolis Zampetakis
Early Stage Convergence Objective Function Non Convex Non Convex Objective

On The Convergence Of Policy Iteration-Based Reinforcement Learning With Monte Carlo Policy Evaluation

On the Convergence of the Gradient Descent Method with Stochastic Fixed-point Rounding Errors under the Polyak-Lojasiewicz Inequality
