Diagonal Hessian

Diagonal Hessian approximation aims to estimate the diagonal elements of the Hessian matrix efficiently. The Hessian is a key ingredient of second-order optimization methods in machine learning, but for a model with n parameters it has n^2 entries, which makes it impractical to compute or store in full at scale; its diagonal has only n. Current research emphasizes computationally inexpensive algorithms, including improved versions of older methods and newer approaches such as HesScale, that approximate the diagonal for use as a preconditioner in gradient descent or directly in second-order optimization. This work makes more sophisticated optimization techniques applicable to large-scale models, which can lead to faster training and improved performance in areas such as reinforcement learning and language model pre-training.
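
To make the idea of a diagonal estimate used as a preconditioner concrete, here is a minimal sketch in plain Python. It uses a Hutchinson-style randomized estimator, diag(H) ≈ E[z ⊙ (Hz)] with random ±1 vectors z, which is a generic textbook technique and is not HesScale or any specific method from the papers listed here. The function names, the small test matrix, and the step-size and damping values are all assumptions chosen for illustration.

```python
import random


def hessian_vector_product(A, v):
    """Return A @ v for a dense matrix A given as a list of rows.

    Stands in for an autodiff Hessian-vector product; here the Hessian of
    f(x) = 0.5 * x^T A x is simply A (an illustrative assumption).
    """
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


def hutchinson_diagonal(hvp, dim, num_samples=100, seed=0):
    """Estimate diag(H) as the average of z * (H z) over Rademacher vectors z."""
    rng = random.Random(seed)
    estimate = [0.0] * dim
    for _ in range(num_samples):
        # Each entry of z is +1 or -1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)
        for i in range(dim):
            estimate[i] += z[i] * hz[i] / num_samples
    return estimate


def preconditioned_step(x, grad, diag_h, lr=1.0, eps=1e-8):
    """One gradient step with each coordinate scaled by 1 / (|H_ii| + eps)."""
    return [x_i - lr * g_i / (abs(d_i) + eps)
            for x_i, g_i, d_i in zip(x, grad, diag_h)]


# Hypothetical symmetric Hessian with true diagonal [4.0, 2.0, 1.0].
A = [[4.0, 0.5, 0.0],
     [0.5, 2.0, 0.5],
     [0.0, 0.5, 1.0]]

diag_estimate = hutchinson_diagonal(
    lambda v: hessian_vector_product(A, v), dim=3, num_samples=2000
)
```

Each sample costs one Hessian-vector product, so the full matrix is never formed. The estimate is noisy whenever the off-diagonal entries are nonzero, and that noise shrinks as more samples are averaged; when the Hessian is exactly diagonal, a single sample already recovers it.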

Papers