Gradient Regularization
Gradient regularization (GR) techniques aim to improve the performance and robustness of machine learning models, primarily by penalizing the magnitude or variance of gradients during training. Current research focuses on developing differentiable GR methods applicable to various architectures (including neural networks, recurrent units, and transformers), exploring their interaction with optimization algorithms (like Adam and gradient descent), and investigating their effectiveness in diverse applications such as image processing, time series forecasting, and natural language processing. The impact of GR lies in its ability to enhance model generalization, mitigate overfitting, and improve numerical stability, leading to more reliable and accurate predictions across various domains.
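As a concrete illustration of the core idea, the sketch below adds an input-gradient penalty to logistic regression: the per-sample gradient of the cross-entropy with respect to the input x is (p − y)·w, so penalizing its squared norm adds λ · mean((p − y)²) · ‖w‖² to the objective. This is a minimal sketch, not any specific paper's method; the toy data, the penalty weight lam, and the use of a finite-difference gradient for the update step are all illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def objective(w, X, y, lam):
    """Cross-entropy plus an input-gradient penalty (illustrative)."""
    p = sigmoid(X @ w)
    ce = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    # For logistic regression, grad of the per-sample CE w.r.t. the input x
    # is (p - y) * w, so its squared norm is (p - y)^2 * ||w||^2.
    grad_penalty = np.mean((p - y) ** 2) * np.sum(w ** 2)
    return ce + lam * grad_penalty

def num_grad(f, w, eps=1e-6):
    """Central finite-difference gradient (fine for this 2-parameter toy)."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (f(w + e) - f(w - e)) / (2 * eps)
    return g

# Toy linearly separable data (assumed setup for the sketch).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Plain gradient descent on the regularized objective.
w = np.zeros(2)
for _ in range(500):
    w -= 0.5 * num_grad(lambda v: objective(v, X, y, lam=0.1), w)
```

Because the penalty scales with ‖w‖² here, it shrinks the learned weights — and hence the model's sensitivity to input perturbations — relative to unregularized training, which is the intuition behind GR's robustness benefits.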
Papers
IBP Regularization for Verified Adversarial Robustness via Branch-and-Bound
Alessandro De Palma, Rudy Bunel, Krishnamurthy Dvijotham, M. Pawan Kumar, Robert Stanforth
From Kernel Methods to Neural Networks: A Unifying Variational Formulation
Michael Unser
RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out-of-Distribution Robustness
Francesco Pinto, Harry Yang, Ser-Nam Lim, Philip H. S. Torr, Puneet K. Dokania