Efficient Training
Efficient training of large-scale machine learning models is a research area that aims to reduce computational cost and resource consumption while maintaining or improving model performance. Current efforts focus on optimizing training strategies for a range of architectures, including transformers, mixture-of-experts models, and neural operators, using techniques such as parameter-efficient fine-tuning, data pruning, and novel loss functions. These advances make models such as large language models and vision transformers more accessible and sustainable, with impact on fields ranging from natural language processing and computer vision to scientific simulation and drug discovery.
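To make parameter-efficient fine-tuning concrete, here is a minimal sketch of one common variant, a low-rank adapter: the pretrained weight matrix stays frozen and only two small factor matrices are trained. It is an illustration under stated assumptions, not the method of any paper listed here. The class name `LowRankAdapter`, the helper `matvec`, and the initialization scale are all hypothetical choices made for this example, and only the forward pass and parameter counts are shown, not a training loop.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LowRankAdapter:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A.

    Hypothetical illustration: A is (rank x d_in), B is (d_out x rank).
    """

    def __init__(self, weight, rank, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen: never updated during fine-tuning
        # A gets a small random init (the 0.01 scale is an assumption).
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        # B starts at zero, so the adapted layer initially matches the frozen one.
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)                # frozen path: W x
        update = matvec(self.B, matvec(self.A, x))   # low-rank path: B (A x)
        return [b + u for b, u in zip(base, update)]

    def trainable_parameters(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_parameters(self):
        return len(self.weight) * len(self.weight[0])
```

For a 64 x 64 weight matrix and rank 4, this sketch trains 4 * 64 + 64 * 4 = 512 parameters and leaves the 4096 original weights frozen, which is where the savings in optimizer state and gradient memory come from.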
Papers
May 8, 2024
May 2, 2024
April 26, 2024
April 9, 2024
March 31, 2024
March 23, 2024
March 18, 2024
March 2, 2024
February 4, 2024
February 2, 2024
February 1, 2024
January 22, 2024
January 19, 2024
January 16, 2024
January 11, 2024
January 9, 2024
January 5, 2024
December 12, 2023
November 30, 2023