Infinite Depth and Width Limit

Research on the infinite depth and width limit of neural networks seeks to understand the behavior of extremely large networks by analyzing their asymptotic properties. Current work characterizes these limits using differential equations and stochastic processes, particularly for architectures such as ResNets, MLPs, and Transformers; obtaining a well-defined limit often requires modifications such as "shaped" activations or rescaled skip connections. These analyses yield insights into trainability, the evolution of the covariance structure across layers, and the interplay between depth and width, and they may inform better network design as well as a deeper understanding of deep learning's fundamental mechanisms.
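The need for such modifications can be illustrated numerically. The sketch below (an illustrative assumption, not taken from any specific paper in this collection) propagates a random input through a linear residual network h ← h + α·W h with Gaussian weights. With the commonly studied branch scaling α = 1/√depth, the squared norm grows by roughly (1 + 1/L) per layer, so the hidden state stays O(1) as depth grows; with α = 1 it roughly doubles per layer and explodes, leaving no sensible limit.

```python
import numpy as np

rng = np.random.default_rng(0)

def resnet_final_norm(depth, width, branch_scale):
    """Propagate a random unit-scale input through a linear ResNet
    h_{l+1} = h_l + branch_scale * W_l h_l with i.i.d. Gaussian
    weights W_l of variance 1/width, and return the final norm."""
    h = rng.standard_normal(width) / np.sqrt(width)  # ||h|| ~ 1
    for _ in range(depth):
        W = rng.standard_normal((width, width)) / np.sqrt(width)
        h = h + branch_scale * (W @ h)
    return float(np.linalg.norm(h))

depth, width = 256, 512
scaled = resnet_final_norm(depth, width, branch_scale=1.0 / np.sqrt(depth))
unscaled = resnet_final_norm(depth, width, branch_scale=1.0)
print(f"1/sqrt(depth) scaling: ||h|| ~ {scaled:.2f}")   # stays O(1)
print(f"no scaling:            ||h|| ~ {unscaled:.2e}")  # blows up
```

The same 1/√depth branch scaling is what makes the layer index behave like a time variable, so that the large-depth dynamics of the hidden-state covariance can be described by a differential equation or, with randomness retained, a stochastic process.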

Papers