Efficient Training
Efficient training of large-scale machine learning models is a critical research area that aims to reduce computational cost and resource consumption while maintaining or improving model performance. Current efforts focus on optimizing training strategies across architectures, including transformers, mixture-of-experts models, and neural operators, using techniques such as parameter-efficient fine-tuning, data pruning, and novel loss functions. These advances are crucial for making models like large language models and vision transformers more accessible and sustainable, with impact on fields ranging from natural language processing and computer vision to scientific simulation and drug discovery.
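To make one of the named techniques concrete, the sketch below illustrates a LoRA-style parameter-efficient fine-tuning update in NumPy. This is an illustrative example under assumed dimensions, not code from any of the surveyed papers: the frozen pretrained weight `W` is left untouched, and only a low-rank pair `A`, `B` is trained, so the trainable parameter count drops from `d_out * d_in` to `r * (d_in + d_out)`.

```python
import numpy as np

# Hypothetical dimensions for illustration.
d_in, d_out, r = 512, 512, 8
alpha = 16.0  # scaling factor, as in LoRA-style methods

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # trainable; zero init, so the
                                            # update starts as a no-op

def forward(x):
    # Effective weight is W + (alpha / r) * B @ A; only A and B receive
    # gradient updates during fine-tuning.
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

full_params = W.size
lora_params = A.size + B.size
print(f"trainable fraction: {lora_params / full_params:.4f}")
```

With these assumed sizes, the trainable fraction is about 3% of the full weight matrix, which is the kind of reduction that makes fine-tuning large models tractable on modest hardware.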