Epoch-Wise Double Descent

Epoch-wise double descent is the surprising phenomenon in which a model's test error, tracked over the course of training, first decreases, then rises as the model begins to overfit the training data (often by fitting noisy labels), and then decreases a second time with continued training, producing two distinct descents in the generalization curve. Current research seeks to explain this behavior across architectures ranging from linear models and two-layer neural networks to deep networks such as Transformers, investigating the roles of input variance, the singular values of the data covariance matrix, and the evolution of learned representations across layers. Understanding when and why the second descent occurs matters in practice: it informs early-stopping and training-budget decisions, potentially yielding better generalization and more efficient use of compute in diverse applications, including time series forecasting.
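The phenomenon can be observed directly by logging test error per epoch in a long training run. Below is a minimal, self-contained sketch of such an experiment, assuming a synthetic teacher-student setup with 20% label noise and a two-layer ReLU network trained by full-batch gradient descent; all names, sizes, and hyperparameters here are illustrative choices, not from any of the surveyed papers, and the curve may need tuning (width, learning rate, noise level, training length) before a clear second descent appears.

```python
# Sketch: watch train/test error per epoch for epoch-wise double descent.
# Assumptions (hypothetical, for illustration): synthetic linear-teacher data,
# 20% training-label noise, width-100 two-layer ReLU net, squared loss,
# full-batch gradient descent. Label noise is what typically drives the
# intermediate rise in test error before the second descent.
import numpy as np

rng = np.random.default_rng(0)

d, n_train, n_test = 30, 200, 2000
w_true = rng.normal(size=d)

def make_data(n, noise):
    X = rng.normal(size=(n, d))
    y = np.sign(X @ w_true)
    flip = rng.random(n) < noise  # randomly flip a fraction of labels
    y[flip] *= -1
    return X, y

X_tr, y_tr = make_data(n_train, noise=0.2)
X_te, y_te = make_data(n_test, noise=0.0)  # clean labels for evaluation

h = 100
W1 = rng.normal(size=(d, h)) / np.sqrt(d)
W2 = rng.normal(size=(h, 1)) / np.sqrt(h)
lr = 0.05

def forward(X):
    A = np.maximum(X @ W1, 0.0)  # hidden ReLU activations
    return A, (A @ W2).ravel()

for epoch in range(1, 20001):
    A, out = forward(X_tr)
    err = out - y_tr
    # Backpropagate the squared-loss gradient through both layers.
    gW2 = A.T @ err[:, None] / n_train
    gA = err[:, None] @ W2.T * (A > 0)
    gW1 = X_tr.T @ gA / n_train
    W2 -= lr * gW2
    W1 -= lr * gW1
    if epoch % 1000 == 0:
        _, out_te = forward(X_te)
        train_err = np.mean(np.sign(out) != y_tr)
        test_err = np.mean(np.sign(out_te) != y_te)
        print(f"epoch {epoch:6d}  train_err {train_err:.3f}  test_err {test_err:.3f}")
# Under favorable settings, test_err dips, rises as the noisy labels are
# memorized, then descends again late in training: two descents in one run.
```

Plotting the logged test error against epoch makes the two descents visible; the same logging pattern carries over unchanged to deeper architectures.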

Papers