Model Scaling
Model scaling studies how increasing model size, training data, and compute affects the performance of large language models (LLMs) and other deep learning architectures, with the goal of allocating resources for the best trade-off between capability and efficiency. Current research focuses on characterizing scaling laws across model types, including transformers and graph neural networks, and on techniques such as parameter-efficient tuning and test-time computation optimization that maximize performance gains per unit of compute. These findings advance both the theoretical understanding of deep learning and the development of more powerful, resource-efficient AI systems across diverse applications.
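To make the idea of a scaling law concrete, the sketch below fits a simple power law relating model size to held-out loss. It is a minimal illustration, not a method from any specific paper: the data points are made up, and the functional form L(N) ≈ A · N^(−α) is an assumed simplification (real studies often include an irreducible-loss term and also fit against data and compute).

```python
import numpy as np

# Illustrative (made-up) points: parameter count N vs. held-out loss.
N = np.array([1e7, 1e8, 1e9, 1e10, 1e11])
loss = np.array([4.2, 3.5, 3.0, 2.6, 2.35])

# Assume a power law L(N) ≈ A * N^(-alpha); in log-log space this is a line,
# so ordinary least squares recovers the exponent alpha and prefactor A.
slope, intercept = np.polyfit(np.log(N), np.log(loss), 1)
alpha, A = -slope, np.exp(intercept)
print(f"alpha ≈ {alpha:.3f}, A ≈ {A:.2f}")

# Extrapolate the fitted law to a larger model, which is how such fits are
# typically used to guide decisions about scaling up.
predicted = A * 1e12 ** (-alpha)
print(f"predicted loss at N = 1e12: {predicted:.2f}")
```

Fits of this kind are what allow practitioners to estimate, before training, how much additional scale (or data, or compute) is needed to reach a target loss.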