Neural Scaling Law

Neural scaling laws describe how the performance of deep neural networks improves with increases in model size, training data, and computational resources. Current research focuses on refining theoretical understanding of these scaling relationships, particularly investigating the interplay between these factors and exploring how architectural choices (e.g., modularity, feature learning) and training methods (e.g., stochastic gradient descent, adaptive sampling) affect scaling behavior across diverse tasks and model types (including transformers, graph neural networks, and those used in embodied AI). These laws are crucial for optimizing resource allocation in deep learning, guiding the design of more efficient and effective models, and improving our fundamental understanding of the learning process itself.

Papers

February 7, 2024

A Resource Model For Neural Scaling Law
Jinyeop Song, Ziming Liu, Max Tegmark, Jeff Gore
Neural Network Neural Topic Neural Scaling Law General Task Specific Model

February 3, 2024

Towards Neural Scaling Laws on Graphs
Jingzhe Liu, Haitao Mao, Zhikai Chen, Tong Zhao, Neil Shah, Jiliang Tang
Graph Drawing Graph Transformer Large Graph Neural Scaling Law Model Scaling Data Scaling Deep Graph Model

February 2, 2024

A Dynamical Model of Neural Scaling Laws
Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan
Gradient Descent Multiplicative Size Scaling Dynamical Model Neural Scaling Law Optimal Scaling

February 1, 2024

LLMs learn governing principles of dynamical systems, revealing an in-context neural scaling law
Toni J.B. Liu, Nicolas Boullé, Raphaël Sarfati, Christopher J. Earls
Large Language Model Language Model Dynamical System General Principle Neural Scaling Law

November 22, 2023

Adaptive Sampling for Deep Learning via Efficient Nonparametric Proxies
Shabnam Daghaghi, Benjamin Coleman, Benito Geordie, Anshumali Shrivastava
Deep Learning Open Sampling Adaptive Sampling Neural Scaling Law Data Sampling Proxy Dataset Dynamic Sampling Nadaraya Watson

October 3, 2023

A Neural Scaling Law from Lottery Ticket Ensembling
Ziming Liu, Max Tegmark
Scaling Law Lottery Ticket Neural Scaling Law Single Neural Network Central Limit Theorem

September 15, 2023

Uncovering Neural Scaling Laws in Molecular Representation Learning
Dingshuo Chen, Yanqiao Zhu, Jieyu Zhang, Yuanqi Du, Zhixun Li, Qiang Liu, Shu Wu, Liang Wang
Molecular Representation Virtual Screening Molecular Data Neural Scaling Law

July 18, 2023

The semantic landscape paradigm for neural networks
Shreyas Gokhale
Neural Network Neural Process Neural Scaling Law Emergent Computation

July 5, 2023

Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources
Feiyang Kang, Hoang Anh Just, Anit Kumar Sahu, Ruoxi Jia
Optimal Transport Data Selection Source Table Neural Scaling Law Strong Scaling Data Mixture

March 23, 2023

The Quantization Model of Neural Scaling
Eric J. Michaud, Ziming Liu, Uzay Girit, Max Tegmark
Large Language Model Neural Scaling Law Extreme Quantization Quantization Model Power Law Scaling

February 17, 2023

A Simplistic Model of Neural Scaling Laws: Multiperiodic Santa Fe Processes
Łukasz Dębowski
Large Language Model Cross Entropy Hawkes Process Neural Scaling Law Simple Model Entropy Rate

February 14, 2023

Data pruning and neural scaling laws: fundamental limitations of score-based algorithms
Fadhel Ayed, Soufiane Hayou
Fundamental Limitation Score Based Data Pruning Neural Scaling Law Random Pruning

December 2, 2022

An Information-Theoretic Analysis of Compute-Optimal Neural Scaling Laws
Hong Jun Jeon, Benjamin Van Roy
Large Language Model Language Model Neural Network Trade Offs Neural Scaling Law Optimal Scaling

October 30, 2022

A Solvable Model of Neural Scaling Laws
Alexander Maloney, Daniel A. Roberts, James Sully
Large Language Model Random Feature Neural Scaling Law Power Law Scaling Parameter Scaling

October 26, 2022

Broken Neural Scaling Laws
Ethan Caballero, Kshitij Gupta, Irina Rish, David Krueger
Transfer Learning Neural Scaling Law Scaling Behavior

September 29, 2022

Scaling Laws for a Multi-Agent Reinforcement Learning Model
Oren Neumann, Claudius Gros
Multi Agent Reinforcement Learning Scaling Law Neural Scaling Law Power Law Scaling

September 13, 2022

Revisiting Neural Scaling Laws in Language and Vision
Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
Deep Learning Human Language Vision Paper Scaling Law Neural Scaling Law Poor Extrapolation Performance BIG Bench

July 4, 2022

How Much More Data Do I Need? Estimating Requirements for Downstream Tasks
Rafid Mahmood, James Lucas, David Acuna, Daiqing Li, Jonah Philion, Jose M. Alvarez, Zhiding Yu, Sanja Fidler, Marc T. Law
Downstream Task Computer Vision Task Linear Estimator Neural Scaling Law Inclusion Requirement Small Training Validation Performance

June 29, 2022

Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher, Robert Geirhos, Shashank Shekhar, Surya Ganguli, Ari S. Morcos
Large Scale Multiplicative Size Scaling Data Pruning Neural Scaling Law Power Law Scaling

November 7, 2021

Learning Finite Linear Temporal Logic Specifications with a Specialized Neural Operator
Homer Walke, Daniel Ritter, Carl Trimbach, Michael Littman
Neural Operator Temporal Logic Neural Scaling Law