Adaptive Gradient
Adaptive gradient methods, which adjust learning rates for individual parameters during training, aim to improve the efficiency and effectiveness of deep learning optimization compared to standard methods like stochastic gradient descent (SGD). Current research focuses on understanding their theoretical convergence properties, particularly in large-batch settings and non-convex optimization problems, with algorithms like Adam and AdaGrad being central to these investigations. This research is crucial for advancing deep learning, as improved optimization techniques directly impact the training speed, generalization performance, and scalability of large-scale models across various applications.
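To make the idea of per-parameter learning rates concrete, below is a minimal NumPy sketch of the Adam update rule, one of the adaptive gradient methods named above. The function name, hyperparameter defaults, and the toy quadratic objective are illustrative choices, not taken from any particular paper's code.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: the effective step size differs per parameter,
    scaled by running estimates of the gradient's first and second moments."""
    m = beta1 * m + (1 - beta1) * grad        # biased first-moment estimate (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2   # biased second-moment estimate (uncentered variance)
    m_hat = m / (1 - beta1 ** t)              # bias correction for initialization at zero
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy demo: a quadratic loss with very different curvature per coordinate,
# the setting where per-parameter step sizes help a single global learning rate.
A = np.array([100.0, 1.0])                    # curvature of each coordinate (hypothetical values)
theta = np.array([1.0, 1.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 501):
    grad = A * theta                          # gradient of 0.5 * sum(A * theta**2)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
print("final parameters:", theta)
```

AdaGrad follows the same pattern but divides by the square root of the accumulated sum of squared gradients rather than an exponential moving average, which is why its effective learning rates shrink monotonically over training.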