Model Scaling
Model scaling studies how increasing model size, training data, and computational resources affects the performance of large language models (LLMs) and other deep learning architectures, with the goal of getting the most capability for a given cost. Current research focuses on characterizing scaling laws across model families, including transformers and graph neural networks, and on techniques such as parameter-efficient tuning and test-time computation optimization that extract more performance from a fixed budget. These findings matter both for the theoretical understanding of deep learning and for building more capable, resource-efficient AI systems across diverse applications.
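As a concrete point of reference (a standard form from the scaling-laws literature, not a result specific to any one paper listed here), compute-optimal scaling studies in the style of Hoffmann et al. (Chinchilla) model pretraining loss as a sum of power laws in parameter count N and training-token count D:

L(N, D) \;=\; E \;+\; \frac{A}{N^{\alpha}} \;+\; \frac{B}{D^{\beta}}

where E is the irreducible loss and A, B, α, β are constants fitted empirically. Minimizing L under a fixed compute budget C ≈ 6ND yields the familiar prescription that parameters and training tokens should be scaled up roughly in proportion as compute grows.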