Diagonal Preconditioners

Diagonal preconditioners are scaling matrices used in optimization algorithms to improve convergence speed by reducing the condition number of the problem's underlying matrix. Current research focuses on developing efficient algorithms for computing optimal or near-optimal diagonal preconditioners, particularly for training large neural networks (e.g., in optimizers such as Adam, Shampoo, and their variants) and for solving large-scale linear systems. This work matters because better preconditioners yield faster training of machine learning models and more efficient solutions to scientific and engineering problems, improving the scalability and performance of many applications.
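
The effect on the condition number can be illustrated with the classic Jacobi (diagonal) preconditioner. The sketch below is an illustrative example, not drawn from any particular paper: it builds a synthetic, badly scaled symmetric positive definite matrix in NumPy and compares the condition number before and after symmetric diagonal scaling with D = diag(A).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# A well-conditioned SPD "core" matrix M.
B = rng.standard_normal((n, n))
M = B @ B.T / n + np.eye(n)

# Badly scaled SPD matrix A = S M S, with row/column scales spanning
# six orders of magnitude (a synthetic example chosen for illustration).
s = np.logspace(-3, 3, n)
A = (s[:, None] * M) * s[None, :]

# Jacobi (diagonal) preconditioner: D = diag(A).
d_inv_sqrt = 1.0 / np.sqrt(np.diag(A))

# Symmetrically preconditioned matrix D^{-1/2} A D^{-1/2}.
A_prec = (d_inv_sqrt[:, None] * A) * d_inv_sqrt[None, :]

print("cond(A)               :", np.linalg.cond(A))       # ~1e12 or larger
print("cond(D^-1/2 A D^-1/2) :", np.linalg.cond(A_prec))  # close to cond(M)
```

Because the ill-conditioning here comes from per-coordinate scaling, the diagonal preconditioner removes most of it; diagonal second-moment scaling in optimizers like Adam exploits the same idea per parameter.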

Papers