Adaptive Optimizers
Adaptive optimizers adjust learning rates dynamically during training, typically per parameter, with the aim of training deep learning models more efficiently and with better generalization than methods such as plain SGD, which apply a single global learning rate. Current research focuses on improving their stability, convergence rates, and generalization performance, particularly for ResNets, Vision Transformers, and language models such as GPT-2. Common approaches include adaptive friction, factorized momentum, and novel preconditioning matrices. These advances matter because they can shorten training, improve model accuracy, and make better use of computational resources across a wide range of machine learning applications.
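To make "dynamically adjust learning rates" concrete, here is a minimal sketch of one update step of Adam, a widely used adaptive optimizer. It is an illustration only, not the method of any particular paper listed on this page. The function name `adam_step` and the default hyperparameters are assumptions chosen for the example.

```python
import math


def adam_step(params, grads, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a list of scalar parameters (illustrative sketch).

    params, grads: current parameter values and their gradients
    m, v: running averages of the gradient and of the squared gradient
    t: 1-based step counter, used for bias correction
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        # Bias correction compensates for initializing m and v at zero.
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        # Per-parameter step: a large gradient history shrinks the effective rate.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v
```

On the first step, starting from zeroed averages, two parameters whose gradients differ by a factor of 100 move by almost exactly the same amount (about `lr`), because each step is divided by the magnitude of its own gradient. Plain SGD with one global learning rate would instead move the first parameter 100 times further than the second.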