Second Order
Second-order methods in machine learning use curvature information, typically the Hessian matrix or an approximation of it, to rescale gradient steps, which can improve optimization efficiency and model performance relative to first-order methods. Because forming and inverting the full Hessian is impractical for large models, current research focuses on computationally tractable second-order algorithms, such as those based on diagonal Hessian approximations or low-rank matrix factorizations, for training large-scale models such as large language models (LLMs) and for improving reinforcement learning. These advances matter because they can offer faster convergence, better generalization, and improved robustness in applications including image classification, natural language processing, and robotics.
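As a minimal sketch of why curvature helps, the following compares plain gradient descent with a full Newton step and a diagonal-Hessian step on a small quadratic. The matrix `A`, the starting point, the learning rate, and the function names are illustrative assumptions, not taken from any of the listed papers.

```python
import numpy as np

# Hypothetical ill-conditioned quadratic f(x) = 0.5 * x^T A x, minimized at x = 0.
# For this objective the gradient is A @ x and the Hessian is A itself.
A = np.array([[2.0, 0.5],
              [0.5, 100.0]])


def grad(x):
    return A @ x


def gradient_descent(x0, lr, steps):
    """First-order baseline: one fixed step size for every coordinate."""
    x = x0.copy()
    for _ in range(steps):
        x = x - lr * grad(x)
    return x


def newton(x0, steps):
    """Full second-order step: solve H d = g instead of inverting H explicitly."""
    x = x0.copy()
    for _ in range(steps):
        x = x - np.linalg.solve(A, grad(x))
    return x


def diagonal_newton(x0, steps):
    """Cheap approximation: keep only the Hessian diagonal and divide elementwise."""
    h_diag = np.diag(A)
    x = x0.copy()
    for _ in range(steps):
        x = x - grad(x) / h_diag
    return x


x0 = np.array([1.0, 1.0])
err_gd = np.linalg.norm(gradient_descent(x0, lr=0.01, steps=10))
err_newton = np.linalg.norm(newton(x0, steps=1))
err_diag = np.linalg.norm(diagonal_newton(x0, steps=10))

print(f"gradient descent, 10 steps: {err_gd:.3e}")
print(f"Newton, 1 step:             {err_newton:.3e}")
print(f"diagonal Newton, 10 steps:  {err_diag:.3e}")
```

In this toy setting the step size for gradient descent is capped by the largest curvature, so progress along the flat direction is slow, while both curvature-aware updates rescale each direction and get much closer to the minimum. The diagonal variant works well here only because `A` is strongly diagonally dominant; on real loss surfaces the Hessian is neither constant nor guaranteed positive definite, so practical methods add damping and estimate the curvature stochastically.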
Papers
December 20, 2024
December 18, 2024
December 4, 2024
November 27, 2024
November 18, 2024
October 29, 2024
October 25, 2024
October 21, 2024
October 18, 2024
October 12, 2024
October 10, 2024
October 8, 2024
October 3, 2024
September 30, 2024
September 26, 2024
September 25, 2024
September 20, 2024
August 31, 2024