Generalized Smoothness

Generalized smoothness relaxes the traditional smoothness assumptions used in optimization, which is particularly relevant for deep learning, where loss functions are often non-convex and their local smoothness varies with the gradient norm. Current research focuses on developing and analyzing optimization algorithms (such as gradient descent, quasi-Newton methods, and optimistic mirror descent) under these generalized conditions, often applied to neural networks (including LSTMs and Transformers) and to multi-objective optimization problems. This work aims to improve the efficiency and robustness of training algorithms and to provide stronger convergence guarantees, with implications for both the theoretical understanding and the practical application of machine learning.
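
A common formalization in this line of work is (L0, L1)-smoothness, which bounds the local smoothness by an affine function of the gradient norm. As a sketch of the standard condition (the symbols L_0 and L_1 are introduced here for illustration, and individual papers may use variants or weaker gradient-based versions), a twice-differentiable function f is (L_0, L_1)-smooth if

    \|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\| \quad \text{for all } x

Setting L_1 = 0 recovers classical L-smoothness, while L_1 > 0 allows the local curvature to grow with the gradient norm, capturing the varying smoothness described above.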

Papers