Adaptive Moment Estimation
Adaptive Moment Estimation (Adam) is a popular optimization algorithm for training deep learning models; it adapts the learning rate of each parameter using running estimates of the first and second moments of the gradients. Current research focuses on improving Adam's theoretical understanding, particularly its convergence guarantees under less restrictive assumptions, and on modifications that enhance its robustness, efficiency (especially in federated learning), and performance across applications. These efforts matter because a firmer theoretical foundation and better practical behavior would enable faster, more reliable training of complex machine learning models in diverse fields.
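To make the per-parameter adaptation concrete, here is a minimal sketch of the standard Adam update from Kingma and Ba (2015); the function name, signature, and default hyperparameters are illustrative choices, not drawn from the papers listed below.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a parameter tensor.

    m and v are running estimates of the first and second moments of the
    gradient; t is the 1-indexed step count used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad         # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2    # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)               # correct initialization bias toward zero
    v_hat = v / (1 - beta2 ** t)
    # Per-parameter step size: coordinates with large v_hat (noisy or steep
    # directions) take smaller steps, which is the "adaptive" part of Adam.
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```

Dividing the step by the square root of the second-moment estimate is what gives each parameter its own effective learning rate; much of the research summarized above analyzes or modifies exactly these moment estimates.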
Papers
The Unified Balance Theory of Second-Moment Exponential Scaling Optimizers in Visual Tasks
Gongyue Zhang, Honghai Liu
Towards Communication-efficient Federated Learning via Sparse and Aligned Adaptive Optimization
Xiumei Deng, Jun Li, Kang Wei, Long Shi, Zehui Xiong, Ming Ding, Wen Chen, Shi Jin, H. Vincent Poor