Scaling Law
Scaling laws in machine learning quantify the relationship between a model's performance and factors such as its size, training data volume, and computational budget. Current research focuses on refining these laws across diverse model architectures, including encoder-decoder and decoder-only transformers, and across optimizers such as SGD and AdamW, and on testing their applicability to tasks such as language modeling, translation, and image classification. Understanding these laws is crucial for allocating resources in model development, improving training efficiency, and guiding the design of future, more capable AI systems. The same principles are also being extended to questions of economic productivity and the impact of data quality.
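A common parameterization of this relationship is the Chinchilla-style loss law $L(N, D) = E + A N^{-\alpha} + B D^{-\beta}$, where $N$ is parameter count and $D$ is the number of training tokens. The sketch below fits such a law to synthetic measurements with SciPy; the coefficients, the data, and the extrapolation point are illustrative assumptions, not results from the papers listed here.

```python
# A minimal sketch (not taken from any of the papers below) of fitting a
# Chinchilla-style scaling law  L(N, D) = E + A * N**(-alpha) + B * D**(-beta)
# to (parameter count, token count, loss) measurements with SciPy.
# All constants and the synthetic data are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit


def scaling_law(x, E, A, alpha, B, beta):
    """Predicted loss for model size N (parameters) and data size D (tokens)."""
    N, D = x
    return E + A * N ** (-alpha) + B * D ** (-beta)


# Synthetic measurements generated from assumed "true" coefficients plus noise,
# standing in for the final losses of a sweep of trained models.
rng = np.random.default_rng(0)
N = np.array([1e8, 2e8, 4e8, 1e9, 2e9, 4e9, 1e10, 2e10])
D = N * np.array([10, 20, 40, 10, 20, 40, 10, 20])  # varied tokens-per-parameter ratios
true_params = (1.7, 400.0, 0.34, 410.0, 0.28)
loss = scaling_law((N, D), *true_params) + rng.normal(0.0, 0.01, size=N.shape)

# Fit the five coefficients; p0 is a rough, Chinchilla-like initial guess.
popt, _ = curve_fit(scaling_law, (N, D), loss,
                    p0=[2.0, 300.0, 0.3, 300.0, 0.3], maxfev=20000)
E, A, alpha, B, beta = popt
print(f"fitted: E={E:.2f} A={A:.0f} alpha={alpha:.2f} B={B:.0f} beta={beta:.2f}")

# Extrapolate the fitted law to a larger, unseen (N, D) configuration.
print("predicted loss at N=7e10, D=1.4e12:",
      round(float(scaling_law((7e10, 1.4e12), *popt)), 3))
```

In practice, fits of this kind are run over many (N, D) training configurations and then used to pick model and data sizes for a larger run before committing the compute.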
Papers
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
Xu Ouyang, Tao Ge, Thomas Hartvigsen, Zhisong Zhang, Haitao Mi, Dong Yu
Inference Scaling fLaws: The Limits of LLM Resampling with Imperfect Verifiers
Benedikt Stroebl, Sayash Kapoor, Arvind Narayanan
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin, Yaqi Zhao, Mingwu Zheng, Ke Lin, Jiarong Ou, Rui Chen, Victor Shea-Jay Huang, Jiahao Wang, Xin Tao, Pengfei Wan, Di Zhang, Baoqun Yin, Wentao Zhang, Kun Gai
Scaling Laws for Black-box Adversarial Attacks
Chuan Liu, Huanran Chen, Yichi Zhang, Yinpeng Dong, Jun Zhu