Second Order
Second-order methods in machine learning leverage curvature information, primarily through Hessian matrices or their approximations, to improve optimization efficiency and model performance relative to first-order methods. Current research focuses on computationally tractable second-order algorithms, such as those employing diagonal Hessian approximations or low-rank matrix factorizations, for training large-scale models such as LLMs and for reinforcement learning. These advances matter because they offer faster convergence, better generalization, and improved robustness across applications including image classification, natural language processing, and robotics.
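The diagonal-Hessian idea mentioned above can be sketched in a few lines: instead of a full Newton step, each gradient coordinate is divided by its own curvature. The objective and function names below are illustrative choices for this sketch, not taken from any particular paper.

```python
import numpy as np

def f(x):
    # Simple separable convex objective (illustrative only).
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

def grad(x):
    # Analytic gradient of f.
    return np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)])

def hess_diag(x):
    # Diagonal of the Hessian; exact here because f is separable.
    return np.array([2.0, 4.0])

def diagonal_newton(x0, steps=10, damping=1e-8):
    # Precondition each gradient coordinate by its curvature --
    # the core idea behind diagonal second-order optimizers.
    # Damping guards against division by near-zero curvature.
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - grad(x) / (hess_diag(x) + damping)
    return x

x_star = diagonal_newton([5.0, 5.0])
print(x_star)  # converges to approximately (1.0, -0.5)
```

Because curvature rescales each coordinate independently, the step size adapts per dimension without the full $O(d^2)$ cost of forming and inverting the complete Hessian, which is what makes such approximations tractable at LLM scale.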