Hessian-Vector Products

Hessian-vector products (HVPs) are a key primitive for bringing second-order information into machine learning optimization, enabling more efficient and robust training of complex models without ever forming the full Hessian. Current research focuses on efficient algorithms for computing HVPs and inverse-Hessian-vector products, particularly in bilevel optimization, acceleration of stochastic gradient descent, and influence function calculations, often employing techniques such as Lanczos iteration, Lie-group preconditioning, and random sketching to improve scalability and accuracy. These advances matter because efficient HVP computation is essential for fast, stable training of large-scale models and for gaining deeper insight into model behavior and training dynamics.
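The reason HVPs scale is Pearlmutter's trick: H(w)v is the directional derivative of the gradient of f at w along v, so it can be obtained by differentiating through the gradient at roughly the cost of two gradient evaluations, with no Hessian matrix materialized. Below is a minimal sketch in JAX using forward-over-reverse autodiff; the toy objective and variable names are illustrative assumptions, not drawn from any specific paper.

import jax
import jax.numpy as jnp

def loss(w):
    # Toy scalar objective; stands in for a model's training loss.
    return jnp.sum(jnp.sin(w) ** 2) + 0.5 * jnp.sum(w ** 2)

def hvp(f, w, v):
    # Pearlmutter's trick: H(w) @ v = d/dt grad f(w + t v) at t = 0,
    # computed as a forward-mode JVP of the reverse-mode gradient,
    # so the full Hessian is never formed.
    return jax.jvp(jax.grad(f), (w,), (v,))[1]

w = jnp.arange(4.0)
v = jnp.ones(4)
print(hvp(loss, w, v))
# Cross-check against the explicit dense Hessian (feasible only at toy scale):
print(jax.hessian(loss)(w) @ v)

Matrix-free access of this kind is what the methods above build on: Lanczos iteration, for instance, probes the Hessian's spectrum using only products like this one, never the matrix itself.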

Papers