Natural Gradient
Natural gradient methods aim to make the training of complex models more efficient and stable by accounting for the underlying geometry of the parameter space: the ordinary gradient is rescaled by the inverse Fisher information matrix, so that a step corresponds to a controlled change in the model's output distribution rather than in its raw parameter values. Current research on natural and related gradient-based methods spans distributed learning (e.g., gradient compression and efficient client selection), inverse problems (using diffusion models), and neural network training (e.g., regularization and novel optimizers such as DiffGrad and AdEMAMix). These advances target better performance and robustness in applications ranging from image processing and medical image analysis to scientific computing and federated learning.
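As a rough illustration of what "accounting for geometry" can look like in practice, the sketch below applies the textbook natural gradient update, theta <- theta - lr * F^{-1} grad, to a small logistic regression problem. It is a toy example under stated assumptions and is not taken from any of the papers listed on this page: the function names (`natural_gradient_step`, `mean_nll`), the synthetic data, the learning rate, and the damping term are all hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mean_nll(theta, X, y):
    """Mean negative log-likelihood of a logistic regression model."""
    p = sigmoid(X @ theta)
    eps = 1e-12  # guards against log(0)
    return -np.mean(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))

def natural_gradient_step(theta, X, y, lr=1.0, damping=1e-4):
    """One damped natural gradient step (illustrative helper, assumed names)."""
    n = len(y)
    p = sigmoid(X @ theta)
    grad = X.T @ (p - y) / n                  # ordinary gradient of the mean NLL
    w = p * (1.0 - p)                         # per-example Bernoulli variance
    fisher = (X * w[:, None]).T @ X / n       # Fisher information matrix
    fisher += damping * np.eye(len(theta))    # damping keeps the solve well-posed
    # Solve F d = grad instead of forming the inverse explicitly.
    return theta - lr * np.linalg.solve(fisher, grad)

# Synthetic data (assumed purely for this example).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = (rng.random(500) < sigmoid(X @ true_theta)).astype(float)

theta = np.zeros(3)
losses = [mean_nll(theta, X, y)]
for _ in range(10):
    theta = natural_gradient_step(theta, X, y)
    losses.append(mean_nll(theta, X, y))

print("first loss:", losses[0], "last loss:", losses[-1])
```

For logistic regression the Fisher matrix has a closed form and is cheap to solve against. In large neural networks it is usually too large to form or invert, which is why practical methods tend to rely on approximations of it.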
Papers
Natural gradient and parameter estimation for quantum Boltzmann machines
Dhrumil Patel, Mark M. Wilde
TrAct: Making First-layer Pre-Activations Trainable
Felix Petersen, Christian Borgelt, Stefano Ermon
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
Ming Li, Yanhong Li, Tianyi Zhou