Gradient Scaling

Gradient scaling techniques aim to improve the efficiency and effectiveness of neural network training by addressing imbalanced gradient magnitudes and interference between tasks. Current research adapts gradient scaling to deep spiking neural networks, incrementally growing networks, and continual learning, often through learning rate adaptation, orthogonal projections, and variance transfer. These advances accelerate training, improve model performance, and mitigate catastrophic forgetting in continual learning, with implications for applications ranging from image classification to biophysical modeling and federated learning.
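
The sketch below illustrates one common form of gradient scaling, assuming a simple multi-task setup: each task's gradient is rescaled to a shared norm before the update, so a task with much larger gradient magnitudes does not dominate the shared parameters. The model, losses, and target norm are illustrative assumptions, not taken from any specific paper listed here.

```python
# Minimal sketch of gradient scaling for two tasks sharing one set of
# parameters; all specifics (model, data, target norm) are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(8, 2)                 # shared parameters for two tasks
params = list(model.parameters())
x = torch.randn(32, 8)
y1 = torch.randn(32, 2)                 # task 1 targets
y2 = torch.randn(32, 2) * 100.0         # task 2 targets on a much larger scale

def flat_norm(grads):
    """L2 norm of a list of gradient tensors viewed as one vector."""
    return torch.cat([g.flatten() for g in grads]).norm()

lr = 1e-2
for step in range(100):
    out = model(x)
    losses = [nn.functional.mse_loss(out, y1),
              nn.functional.mse_loss(out, y2)]

    # Per-task gradients with respect to the shared parameters.
    task_grads = [torch.autograd.grad(l, params, retain_graph=True)
                  for l in losses]

    # Rescale every task gradient to a common norm (here, the mean of the
    # per-task norms) so the larger-magnitude task does not dominate.
    norms = [flat_norm(g) for g in task_grads]
    target = torch.stack(norms).mean()
    scaled = [[gi * (target / (n + 1e-12)) for gi in g]
              for g, n in zip(task_grads, norms)]

    # Combine the scaled gradients and take a plain SGD step.
    with torch.no_grad():
        for i, p in enumerate(params):
            p -= lr * sum(g[i] for g in scaled)
```

Other variants mentioned above (learning rate adaptation, orthogonal projections, variance transfer) modify the same basic step, deciding how each gradient is transformed before it is applied.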

Papers