Batch Optimization
Batch optimization techniques aim to accelerate the training of large neural networks by processing data in larger batches, thereby exploiting parallel computing resources. Current research focuses on adapting these techniques to a range of model architectures and tasks, including contrastive learning, reinforcement learning, and dense visual prediction, often employing layer-wise adaptive optimizers such as LARS and LAMB, as well as newer approaches such as AGVM, to mitigate challenges such as gradient variance misalignment. These advances substantially reduce training time for complex models, enabling faster development and deployment of high-performing AI systems in fields such as computer vision and natural language processing. A minimal sketch of the LARS-style layer-wise update appears below.
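To make the layer-wise scaling idea behind large-batch optimizers concrete, the following is a minimal NumPy sketch of a LARS-style update (You et al., 2017), not an implementation from any particular library; the function name lars_step and its hyperparameter defaults are illustrative assumptions.

```python
# Illustrative sketch of a LARS-style layer-wise update (assumed names/defaults).
import numpy as np

def lars_step(weights, grads, velocity, global_lr=1.0, trust_coeff=1e-3,
              weight_decay=1e-4, momentum=0.9, eps=1e-9):
    """Apply one LARS-style update to per-layer weight/gradient arrays."""
    for i, (w, g) in enumerate(zip(weights, grads)):
        w_norm = np.linalg.norm(w)
        g_norm = np.linalg.norm(g)
        # Layer-wise trust ratio: scales each layer's step relative to its
        # weight norm, which keeps large-batch updates stable per layer.
        local_lr = trust_coeff * w_norm / (g_norm + weight_decay * w_norm + eps)
        update = g + weight_decay * w
        velocity[i] = momentum * velocity[i] + global_lr * local_lr * update
        w -= velocity[i]
    return weights, velocity

# Toy usage: two "layers" with random weights and gradients.
weights = [np.random.randn(64, 32), np.random.randn(32)]
grads = [np.random.randn(64, 32), np.random.randn(32)]
velocity = [np.zeros_like(w) for w in weights]
weights, velocity = lars_step(weights, grads, velocity)
```

LAMB follows the same layer-wise trust-ratio idea but applies it on top of Adam-style moment estimates rather than plain momentum SGD.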
Papers
Rebalancing Batch Normalization for Exemplar-based Class-Incremental Learning
Sungmin Cha, Sungjun Cho, Dasol Hwang, Sunwon Hong, Moontae Lee, Taesup Moon
ScaLA: Accelerating Adaptation of Pre-Trained Transformer-Based Language Models via Efficient Large-Batch Adversarial Noise
Minjia Zhang, Niranjan Uma Naresh, Yuxiong He