Early Stage Convergence
Research on early-stage convergence in machine learning seeks to understand and improve the initial phase of training, with the goals of accelerating convergence and improving generalization. Current work examines this question across optimization algorithms (e.g., Adam, SGD, FedProx), model architectures (e.g., transformers, diffusion models), and problem domains (e.g., federated learning, collaborative filtering). These studies draw on tools from dynamical systems theory and optimal transport to establish convergence guarantees and bounds, contributing to more efficient and robust machine learning systems across diverse applications.
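As a rough illustration of what "early-stage convergence" refers to, the minimal NumPy sketch below tracks the training loss over the first few iterations of mini-batch SGD and Adam on a toy least-squares problem. The problem setup, step sizes, and batch size are illustrative assumptions and are not taken from any of the papers listed here.

```python
# Illustrative sketch (assumed toy setup): early-stage loss of SGD vs. Adam
# on a synthetic least-squares problem.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def loss_and_grad(w, idx):
    """Mean-squared-error loss and gradient on the rows indexed by idx."""
    Xb, yb = X[idx], y[idx]
    r = Xb @ w - yb
    return 0.5 * np.mean(r ** 2), Xb.T @ r / len(idx)

def run_sgd(steps=50, lr=0.1, batch=20):
    """Mini-batch SGD; records the full-data loss before each update."""
    w, losses = np.zeros(d), []
    for _ in range(steps):
        losses.append(loss_and_grad(w, np.arange(n))[0])
        idx = rng.choice(n, size=batch, replace=False)
        _, g = loss_and_grad(w, idx)
        w -= lr * g
    return losses

def run_adam(steps=50, lr=0.1, batch=20, b1=0.9, b2=0.999, eps=1e-8):
    """Mini-batch Adam with bias-corrected first and second moments."""
    w, m, v, losses = np.zeros(d), np.zeros(d), np.zeros(d), []
    for t in range(1, steps + 1):
        losses.append(loss_and_grad(w, np.arange(n))[0])
        idx = rng.choice(n, size=batch, replace=False)
        _, g = loss_and_grad(w, idx)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g ** 2
        m_hat, v_hat = m / (1 - b1 ** t), v / (1 - b2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return losses

# The first few iterations are where early-stage behaviour differs most.
print("SGD  first 5 losses:", np.round(run_sgd()[:5], 4))
print("Adam first 5 losses:", np.round(run_adam()[:5], 4))
```

Monitoring only these first iterations, rather than the full training run, is the kind of early-phase behaviour the papers below analyze and try to bound.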
Papers
On the Convergence of Optimizing Persistent-Homology-Based Losses
Yikai Zhang, Jiachen Yao, Yusu Wang, Chao Chen
Memory-efficient model-based deep learning with convergence and robustness guarantees
Aniket Pramanik, M. Bridget Zimmerman, Mathews Jacob
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
Dongsheng Ding, Kaiqing Zhang, Jiali Duan, Tamer Başar, Mihailo R. Jovanović
Trajectory of Mini-Batch Momentum: Batch Size Saturation and Convergence in High Dimensions
Kiwon Lee, Andrew N. Cheng, Courtney Paquette, Elliot Paquette
Faster Rates of Convergence to Stationary Points in Differentially Private Optimization
Raman Arora, Raef Bassily, Tomás González, Cristóbal Guzmán, Michael Menart, Enayat Ullah
Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis
Wuyang Chen, Wei Huang, Xinyu Gong, Boris Hanin, Zhangyang Wang
An Efficient Summation Algorithm for the Accuracy, Convergence and Reproducibility of Parallel Numerical Methods
Farah Benmouhoub, Pierre-Loïc Garoche, Matthieu Martel