Training Deep Neural Networks
Training deep neural networks efficiently and effectively remains a central challenge in machine learning. Current research focuses on improving training algorithms (e.g., exploring second-order methods and adaptive gradient normalization), optimizing model architectures (e.g., reversible architectures and sparse mixtures of experts), and reducing computational costs (e.g., through gradient sampling, model compression, and efficient distributed training). These advancements aim to enhance model performance, reduce energy consumption, and enable training on larger datasets or resource-constrained devices, impacting various applications from medical image analysis to financial modeling.
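Of the directions above, adaptive gradient normalization is the easiest to make concrete. The sketch below is a minimal, hypothetical illustration in PyTorch, not any specific paper's method: each parameter's gradient is rescaled to unit norm before the SGD update, so the effective step size is governed by the learning rate rather than by raw gradient magnitude. The helper name `normalized_sgd_step` is an assumption for illustration.

```python
import torch

def normalized_sgd_step(params, lr=1e-2, eps=1e-8):
    # Hypothetical helper: rescale each parameter's gradient to unit
    # norm before the update, so the step size depends on lr alone
    # rather than on the raw gradient magnitude.
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            p -= lr * p.grad / (p.grad.norm() + eps)  # eps guards against zero gradients

# Toy usage: one normalized step on a small linear model.
model = torch.nn.Linear(4, 1)
x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
normalized_sgd_step(model.parameters())
```

Per-parameter normalization is only one design choice here; clipping the global gradient norm (e.g., via `torch.nn.utils.clip_grad_norm_`) is the more common off-the-shelf variant of the same idea.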