Second Order
Second-order methods in machine learning leverage curvature information, typically through the Hessian matrix or approximations to it, to improve optimization efficiency and model performance relative to first-order methods. Current research focuses on computationally tractable second-order algorithms, such as those employing diagonal Hessian approximations or low-rank matrix factorizations, for training large-scale models like LLMs and for reinforcement learning. These advances are significant because they can offer faster convergence, better generalization, and improved robustness across applications including image classification, natural language processing, and robotics.
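To make the idea of a diagonal Hessian approximation concrete, here is a minimal pure-Python sketch (all names and the quadratic objective are illustrative, not taken from any specific paper). For a quadratic f(x) = ½ xᵀAx − bᵀx, the gradient is Ax − b and the exact Hessian is A; the update below divides each gradient coordinate by the corresponding diagonal Hessian entry instead of solving the full Newton system.

```python
# Illustrative quadratic objective: f(x) = 0.5*x'Ax - b'x, so grad(x) = Ax - b.
# A is diagonally dominant, so the per-coordinate update converges here.
A = [[3.0, 0.2], [0.2, 2.0]]
b = [1.0, 1.0]

def grad(x):
    # Gradient of the quadratic: A x - b
    return [sum(A[i][j] * x[j] for j in range(2)) - b[i] for i in range(2)]

def diag_newton_step(x, eps=1e-8):
    # Per-coordinate Newton step using only the Hessian diagonal H_ii = A[i][i];
    # eps guards against division by a near-zero curvature estimate.
    g = grad(x)
    return [x[i] - g[i] / (A[i][i] + eps) for i in range(2)]

x = [0.0, 0.0]
for _ in range(50):
    x = diag_newton_step(x)
# x now approximates the exact minimizer A^{-1} b
```

A full Newton step would invert A (O(n³) for an n×n Hessian), while the diagonal approximation costs O(n) per step, which is the kind of trade-off that makes second-order ideas tractable at LLM scale.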