Sharpness Dynamic

Research on sharpness dynamics in neural network training studies how the curvature of the loss landscape evolves during optimization. Curvature is typically measured by the sharpness, the largest eigenvalue of the Hessian of the loss. Current work investigates the relationship between sharpness and generalization, including how algorithms such as Sharpness-Aware Minimization (SAM) can be improved to find flatter minima and make models more robust. This research helps explain why hyperparameters are observed to transfer across model scales, and it sheds light on the optimization dynamics behind better generalization and efficient training of large models. It informs both theoretical understanding and practical applications such as model quantization.
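
To make the definition of sharpness concrete, the following is a minimal sketch, not taken from any of the papers listed here, that estimates the largest Hessian eigenvalue by power iteration on Hessian-vector products. The toy quadratic `loss`, the helper names (`grad`, `hvp`, `sharpness`), and the finite-difference step sizes are all assumptions chosen for illustration; a real training setup would more likely use automatic differentiation.

```python
import math
import random


def loss(w):
    # Hypothetical toy loss: 0.5 * (4*w0^2 + w1^2), whose Hessian is diag(4, 1).
    return 0.5 * (4.0 * w[0] ** 2 + 1.0 * w[1] ** 2)


def grad(f, w, eps=1e-5):
    # Central-difference approximation of the gradient of f at w.
    g = []
    for i in range(len(w)):
        w_plus, w_minus = list(w), list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        g.append((f(w_plus) - f(w_minus)) / (2 * eps))
    return g


def hvp(f, w, v, eps=1e-3):
    # Hessian-vector product H @ v via a central difference of gradients.
    w_plus = [wi + eps * vi for wi, vi in zip(w, v)]
    w_minus = [wi - eps * vi for wi, vi in zip(w, v)]
    g_plus, g_minus = grad(f, w_plus), grad(f, w_minus)
    return [(a - b) / (2 * eps) for a, b in zip(g_plus, g_minus)]


def sharpness(f, w, iters=100, seed=0):
    # Power iteration: converges to the Hessian eigenvalue of largest magnitude,
    # which equals the sharpness when that eigenvalue is positive.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in w]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    lam = 0.0
    for _ in range(iters):
        hv = hvp(f, w, v)
        lam = sum(a * b for a, b in zip(v, hv))  # Rayleigh quotient v^T H v
        norm = math.sqrt(sum(x * x for x in hv))
        if norm == 0.0:
            break
        v = [x / norm for x in hv]
    return lam


if __name__ == "__main__":
    lam_max = sharpness(loss, [0.3, -0.7])
    print(f"estimated sharpness: {lam_max:.4f}")
```

On a quadratic loss like this one, gradient descent with learning rate `lr` is stable only when the sharpness stays below `2 / lr`, which is the threshold that the edge-of-stability literature tracks during training.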

Papers

September 20, 2024

Bilateral Sharpness-Aware Minimization for Flatter Minima
Jiaxin Deng, Junbiao Pang, Baochang Zhang, Qingming Huang
Sharpness Aware Minimization Improved Generalization Flat Minimum Sharpness Dynamic

February 27, 2024

Super Consistency of Neural Network Landscapes and Learning Rate Transfer
Lorenzo Noci, Alexandru Meterez, Thomas Hofmann, Antonio Orvieto
Deep Learning Optimization Purpose Learning Rate Neural Tangent Kernel Hessian Matrix Benchmark Image Scaling Limit Sharpness Dynamic

November 3, 2023

Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Dayal Singh Kalra, Tianyu He, Maissam Barkeshli
Neural Network Core Stability Kill Chaos Fixed Point Gradient Descent Dynamic Sharpness Reduction Sharpness Dynamic

May 1, 2023

Venn Diagram Multi-label Class Interpretation of Diabetic Foot Ulcer with Color and Sharpness Enhancement
Md Mahamudul Hasan, Moi Hoon Yap, Md Kamrul Hasan
Multi Label MAESTRO Dataset Multi Label Classification Multi Class Classification Color Object Diabetic Foot Ulcer Wound Image Sharpness Dynamic

October 13, 2022

SQuAT: Sharpness- and Quantization-Aware Training for BERT
Zheng Wang, Juncheng B Li, Shuhui Qu, Florian Metze, Emma Strubell
BERT Model Ticket BERT Quantization Operator Quantization Aware Training Layer Wise Quantization Sharpness Aware Training Sharpness Dynamic

July 26, 2022

Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Zhouzi Li, Zixuan Wang, Jian Li
Extreme Edge Core Stability Linear Neural Network Modern Neural Network Full Batch Gradient Descent Pan Sharpening Sharpness Dynamic