Training Deep
Training deep neural networks efficiently and effectively remains a central challenge in machine learning. Current research focuses on three areas: improving training algorithms (e.g., second-order methods and adaptive gradient normalization), optimizing model architectures (e.g., reversible architectures and sparse mixtures of experts), and reducing computational costs (e.g., through gradient sampling, model compression, and efficient distributed training). These advances aim to improve model performance, reduce energy consumption, and enable training on larger datasets or on resource-constrained devices, with applications ranging from medical image analysis to financial modeling.