Deep Linear

Deep linear networks, simplified models of deep neural networks, are used to gain theoretical insights into the learning dynamics and generalization capabilities of their more complex counterparts. Current research focuses on understanding the impact of initialization strategies, optimization algorithms (including gradient descent and predictive coding), and architectural features (like depth and width) on learning dynamics and implicit regularization, often analyzing specific model architectures like linear state space models and convolutional networks. These studies provide crucial theoretical foundations for understanding phenomena like neural collapse and critical learning periods, ultimately informing the design and improvement of more efficient and robust deep learning algorithms for various applications, including image deblurring and matrix completion.

Papers