Adaptive Gradient Method
Adaptive gradient methods are optimization algorithms that adjust per-parameter learning rates during training based on the history of observed gradients, aiming to accelerate convergence and improve performance compared to methods with a fixed learning rate. Current research focuses on analyzing the convergence properties of algorithms such as AdaGrad and Adam under various assumptions on the objective function (e.g., smoothness, convexity) and on the gradient noise, particularly in large-batch and federated learning settings. These analyses are crucial for understanding the strengths and limitations of adaptive methods, ultimately enabling more robust and efficient training of complex models in applications such as deep learning and large language models.
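To make the per-parameter adaptation concrete, here is a minimal NumPy sketch of the standard AdaGrad and Adam update rules; the function names, hyperparameter defaults, and toy objective are illustrative choices, not drawn from any particular paper listed below.

```python
import numpy as np

def adagrad_step(param, grad, accum, lr=0.01, eps=1e-8):
    """One AdaGrad step: accumulate squared gradients and scale each coordinate's update."""
    accum += grad ** 2
    param -= lr * grad / (np.sqrt(accum) + eps)
    return param, accum

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: exponential moving averages of the gradient and its square,
    with bias correction for early iterations (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
    param -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimize f(x) = ||x||^2 with Adam.
x = np.array([3.0, -2.0])
m, v = np.zeros_like(x), np.zeros_like(x)
for t in range(1, 201):
    grad = 2 * x                   # gradient of ||x||^2
    x, m, v = adam_step(x, grad, m, v, t)
print(x)                           # approaches the minimizer at the origin
```

Note how both updates divide by a running statistic of squared gradients, so coordinates with consistently large gradients receive smaller effective step sizes; this per-coordinate scaling is the shared mechanism whose convergence behavior the works below analyze.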