Diagonal Preconditioners
Diagonal preconditioners are diagonal scaling matrices applied to a linear system or optimization problem to speed up the convergence of iterative methods, typically by reducing the condition number of the underlying matrix (for example, the system matrix or the Hessian). Current research focuses on efficient algorithms for computing optimal or near-optimal diagonal preconditioners, particularly for training large neural networks (e.g., adaptive methods such as Adam, whose per-coordinate step scaling acts as a diagonal preconditioner, and related methods such as Shampoo, which uses Kronecker-factored rather than strictly diagonal preconditioners) and for solving large-scale linear systems. Better preconditioners reduce the number of iterations needed to reach a given accuracy, which translates into faster training of machine learning models and cheaper solves in scientific and engineering applications.
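As a rough illustration of the condition-number effect described above, the sketch below applies symmetric Jacobi scaling, one of the simplest diagonal preconditioners, to a deliberately badly scaled symmetric positive definite matrix. The test matrix, the helper names `make_badly_scaled_spd` and `jacobi_scale`, and the choice of Jacobi scaling itself are assumptions made for this example; Jacobi scaling is a common heuristic, not the optimal diagonal preconditioner that the research above targets.

```python
import numpy as np


def make_badly_scaled_spd(n=50, seed=0):
    """Hypothetical test matrix: a well-conditioned SPD matrix whose rows and
    columns are rescaled by factors spanning three orders of magnitude."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((n, n))
    M = B @ B.T / n + np.eye(n)      # SPD with a modest condition number
    scales = np.logspace(0, 3, n)    # scaling factors from 1 to 1000
    return M * scales[:, None] * scales[None, :]


def jacobi_scale(A):
    """Symmetric Jacobi scaling: D^{-1/2} A D^{-1/2} with D = diag(A).

    Assumes A has a strictly positive diagonal (true for SPD matrices)."""
    s = 1.0 / np.sqrt(np.diag(A))
    return A * s[:, None] * s[None, :]


A = make_badly_scaled_spd()
A_scaled = jacobi_scale(A)

kappa_before = np.linalg.cond(A)        # 2-norm condition number
kappa_after = np.linalg.cond(A_scaled)

print(f"condition number before scaling: {kappa_before:.3e}")
print(f"condition number after scaling:  {kappa_after:.3e}")
```

In this constructed case the scaling removes the artificial row and column scaling, so the scaled matrix has a unit diagonal and a far smaller condition number. On matrices whose ill-conditioning does not come from poor scaling, a diagonal preconditioner may help much less.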