Infinite Depth and Width Limit
Research on the infinite-depth-and-width limit of neural networks studies very large networks through their asymptotic behavior as depth and width grow together. Current work characterizes these limits with differential equations and stochastic processes, particularly for ResNets, MLPs, and Transformers, and often relies on modifications such as "shaped" activations or skip connections to keep the limit well defined. These analyses give insight into network trainability, covariance structure, and the interplay between depth and width, and may inform network design and a deeper understanding of deep learning's fundamental mechanisms.
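To make the role of "shaped" activations concrete, the following is a minimal NumPy sketch rather than any particular paper's implementation. It pushes two random inputs through a randomly initialized MLP whose depth is a fixed fraction of its width, then measures how correlated the two outputs are. Everything specific here is an assumption made for illustration: the shaping form (a leaky ReLU with slopes 1 + c/sqrt(width)), the constants `c_plus` and `c_minus`, the helper names, and the width and depth values.

```python
import numpy as np


def shaped_slopes(width, c_plus=0.0, c_minus=-1.0):
    """Slopes of a 'shaped' leaky ReLU, s = 1 + c / sqrt(width) (assumed form)."""
    return 1.0 + c_plus / np.sqrt(width), 1.0 + c_minus / np.sqrt(width)


def final_correlation(rng, width, depth, s_plus, s_minus):
    """Cosine similarity between two inputs after `depth` random layers."""
    h = rng.standard_normal((width, 2))  # two independent random inputs
    # For z ~ N(0, 1), E[phi(z)^2] = (s_plus^2 + s_minus^2) / 2, so this gain
    # approximately preserves the squared norm from one layer to the next.
    gain = 2.0 / (s_plus**2 + s_minus**2)
    for _ in range(depth):
        # Piecewise-linear activation: slope s_plus for x > 0, s_minus for x < 0.
        a = s_plus * np.maximum(h, 0.0) + s_minus * np.minimum(h, 0.0)
        w = rng.standard_normal((width, width))  # fresh Gaussian weights
        h = np.sqrt(gain / width) * (w @ a)
    u, v = h[:, 0], h[:, 1]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def correlation_stats(width, depth, s_plus, s_minus, trials=20, seed=0):
    """Mean and standard deviation of the final correlation over random networks."""
    rng = np.random.default_rng(seed)
    rhos = [final_correlation(rng, width, depth, s_plus, s_minus)
            for _ in range(trials)]
    return float(np.mean(rhos)), float(np.std(rhos))


if __name__ == "__main__":
    width, depth = 128, 32  # depth-to-width ratio held at 1/4 (illustrative)
    # Plain ReLU is the unshaped case: slope 1 on the right, 0 on the left.
    print("plain ReLU  (mean, std):", correlation_stats(width, depth, 1.0, 0.0))
    s_plus, s_minus = shaped_slopes(width)
    print("shaped ReLU (mean, std):", correlation_stats(width, depth, s_plus, s_minus, seed=1))
```

In this sketch the plain ReLU network should drive the two inputs toward nearly identical directions, so the correlation collapses toward 1 and the limit degenerates. The shaped network should instead keep the correlation spread out and random from one network to the next, which is the kind of non-degenerate behavior that a stochastic-process description is meant to capture.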