Two Layer Neural Network

Two-layer neural networks serve as a tractable model for understanding the behavior of deeper networks. Research focuses on their optimization dynamics, generalization, and feature learning. Current work analyzes training with stochastic gradient descent and related algorithms, often in the neural tangent kernel regime, to characterize convergence rates and the effect of hyperparameters such as learning rate and network width. These studies inform the theoretical foundations of deep learning, guide the design of more efficient and robust algorithms, and help explain phenomena such as spectral bias and the emergence of skills during training.
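As a concrete reference point, the following is a minimal sketch of the kind of model these papers study: a two-layer ReLU network with 1/sqrt(m) output scaling (the scaling commonly used in neural tangent kernel analyses), trained by minibatch SGD on squared loss. It is not taken from any of the listed papers. The function names, the single-neuron target, and the default width, learning rate, and batch size are all assumptions chosen for illustration.

```python
import numpy as np

def make_data(n=200, d=5, seed=0):
    # Hypothetical toy data: inputs on the unit sphere, single-ReLU-neuron target.
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    y = np.maximum(X @ u, 0.0)
    return X, y

def forward(W, a, X):
    # f(x) = (1 / sqrt(m)) * sum_j a_j * relu(w_j . x)
    m = W.shape[0]
    H = np.maximum(X @ W.T, 0.0)          # hidden activations, shape (n, m)
    return H @ a / np.sqrt(m), H

def sgd_step(W, a, X, y, lr):
    # One SGD step on the loss 0.5 * mean((f(x) - y)^2) over the batch.
    n, m = X.shape[0], W.shape[0]
    pred, H = forward(W, a, X)
    err = pred - y                         # shape (n,)
    scale = n * np.sqrt(m)
    grad_a = H.T @ err / scale
    mask = (H > 0).astype(X.dtype)         # ReLU derivative, shape (n, m)
    grad_W = (err[:, None] * mask * a[None, :]).T @ X / scale
    return W - lr * grad_W, a - lr * grad_a

def train(X, y, m=256, lr=0.5, steps=500, batch=32, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((m, X.shape[1]))   # first-layer weights
    a = rng.choice([-1.0, 1.0], size=m)        # second-layer weights
    losses = []
    for _ in range(steps):
        pred, _ = forward(W, a, X)
        losses.append(0.5 * np.mean((pred - y) ** 2))  # full-data loss before the step
        idx = rng.choice(X.shape[0], size=batch, replace=False)
        W, a = sgd_step(W, a, X[idx], y[idx], lr)
    return W, a, losses

if __name__ == "__main__":
    X, y = make_data()
    W, a, losses = train(X, y)
    print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

Varying `m` and `lr` in this sketch is a simple way to probe the width and learning-rate effects described above, though the regimes analyzed in the listed papers may use different scalings and initializations.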

Papers

May 27, 2024

Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Nikita Tsoy, Nikola Konstantinov
Deep Model Gradient Flow Two Layer Neural Network Simplicity Bias Separable Data Two Layer Network

May 26, 2024

Geometry of Critical Sets and Existence of Saddle Branches for Two-layer Neural Networks
Leyang Zhang, Yaoyu Zhang, Tao Luo
Neural Network Geometric Analysis Two Layer Neural Network Saddle to Saddle Dynamic

May 22, 2024

Disentangle Sample Size and Initialization Effect on Perfect Generalization for Single-Neuron Target
Jiajie Zhao, Zhiwei Bai, Yaoyu Zhang
Strong Generalization Two Layer Neural Network Data Imbalance Individual Neuron Easy to Hard Generalization Sample Size Initialization Bias Neuron Tracing

April 29, 2024

Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks
Fanghui Liu, Leello Dadi, Volkan Cevher
Learning Abstract Kernel Method Kernel Hilbert Space Two Layer Neural Network Generalization Property Target Norm Metric Entropy

April 26, 2024

An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem
Yoonsoo Nam, Nayara Fonseca, Seok Hyeong Lee, Chris Mingard, Ard A. Louis
Neural Network Deep Learning Model Path Breaking Emergence Two Layer Neural Network Emergence Dynamic

April 20, 2024

Solution space and storage capacity of fully connected two-layer neural networks with generic activation functions
Sota Nishiyama, Masayuki Ohzeki
Machine Learning Model Activation Function Two Layer Neural Network Solution Space Input Output Pair

March 22, 2024

Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
Shokichi Takakura, Taiji Suzuki
Mean Field Kernel Method Two Layer Neural Network Kernel Based Data Dependent Kernel

February 25, 2024

On the dynamics of three-layer neural networks: initial condensation
Zheng-An Chen, Tao Luo
Neural Network Two Layer Neural Network Three Layer Neural Network Initial Condensation

February 14, 2024

Measuring Sharpness in Grokking
Jack Miller, Patrick Gleeson, Charles O'Neill, Thang Bui, Noam Levi
Two Layer Neural Network Robust Framework Grokking Phenomenon Sharpness Measure Sharpness Aware Optimization Validation Performance

February 7, 2024

Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro
High Dimensional Feature Learning Two Layer Neural Network Asymptotic Behavior Two Layer Network Kernel Regime Gradient Step Gaussian Universality

February 5, 2024

The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala
Gradient Descent Gradient Flow Full Information Two Layer Neural Network Large Batch Two Layer Network Pas Stochastic Gradient Descent

February 1, 2024

Comparing Spectral Bias and Robustness For Two-Layer Neural Networks: SGD vs Adaptive Random Fourier Features
Aku Kammonen, Lisi Liang, Anamika Pandey, Raúl Tempone
Native Robustness Stochastic Gradient Descent Two Layer Neural Network Adversarial Noise Spectral Bias Random Fourier Feature

November 21, 2023

In-Context Learning Functions with Varying Number of Minima
David Oniani, Yanshan Wang
Context Learning Numerical Data Two Layer Neural Network Function Approximation

October 29, 2023

Proving Linear Mode Connectivity of Neural Networks via Optimal Transport
Damien Ferbach, Baptiste Goujaud, Gauthier Gidel, Aymeric Dieuleveut
Neural Network Deep Neural Network Optimal Transport Deep Learning Architecture Two Layer Neural Network Non Convex Optimization Problem Linear Mode Connectivity

October 16, 2023

Approximating Two-Layer Feedforward Networks for Efficient Transformers
Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
Two Layer Neural Network Efficient Transformer Sparse Mixture Efficient Large Language Model

October 12, 2023

Differentially Private Non-convex Learning for Multi-layer Neural Networks
Hanpu Shen, Cheng-Long Wang, Zihang Xiang, Yiming Ying, Di Wang
Neural Tangent Kernel Two Layer Neural Network Private Stochastic Multi Layer Neural Network