Scaling Law
Scaling laws in machine learning quantify the relationship between a model's performance and factors such as its size, training data volume, and computational budget. Current research focuses on refining these laws across diverse architectures, including encoder-decoder and decoder-only transformers, and optimizers such as SGD and AdamW, and on testing their applicability to tasks such as language modeling, translation, and image classification. Understanding these laws is crucial for allocating resources during model development, improving training efficiency, and guiding the design of future, more capable AI systems. The same principles are also being extended to questions of economic productivity and the impact of data quality.
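In practice, such laws are typically expressed as a saturating power law in model size or data and fitted to observed losses. The sketch below is a minimal illustration of that fitting step, assuming the common form loss(N) = E + A / N^alpha; the (parameter count, validation loss) pairs, initial guesses, and fitted constants are hypothetical and are not taken from the papers listed here.

```python
import numpy as np
from scipy.optimize import curve_fit

# Assumed saturating power law: loss(N) = E + A / N**alpha,
# where N is the number of model parameters.
def scaling_law(n_params, E, A, alpha):
    return E + A / n_params**alpha

# Hypothetical (model size, validation loss) observations.
n_params = np.array([1e7, 3e7, 1e8, 3e8, 1e9, 3e9])
val_loss = np.array([5.35, 4.21, 3.38, 2.88, 2.51, 2.28])

# Fit the curve; p0 provides rough initial guesses for E, A, alpha.
popt, _ = curve_fit(scaling_law, n_params, val_loss, p0=[1.5, 800.0, 0.3], maxfev=10000)
E, A, alpha = popt
print(f"Fitted: loss(N) ~ {E:.2f} + {A:.0f} / N^{alpha:.3f}")

# Extrapolate to a larger model to inform resource allocation.
print("Predicted loss at 10B params:", scaling_law(1e10, *popt))
```

Fitting in this way lets one extrapolate performance to larger, not-yet-trained models, which is the core use of scaling laws for planning compute and data budgets.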
Papers
Wukong: Towards a Scaling Law for Large-Scale Recommendation
Buyun Zhang, Liang Luo, Yuxin Chen, Jade Nie, Xi Liu, Daifeng Guo, Yanli Zhao, Shen Li, Yuchen Hao, Yantao Yao, Guna Lakshminarayanan, Ellie Dingqiao Wen, Jongsoo Park, Maxim Naumov, Wenlin Chen
Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
Lingjiao Chen, Jared Quincy Davis, Boris Hanin, Peter Bailis, Ion Stoica, Matei Zaharia, James Zou