Gradient Step
The gradient step, the fundamental update rule behind most machine learning algorithms, is under intense scrutiny, particularly regarding its role in feature learning and in the efficient training of large models. Current research focuses on optimizing gradient steps in various contexts: federated learning, where sparse training and client-side acceleration are key; multi-task learning, where loss and gradient balancing address gradient imbalance; and deep neural networks, where the effect of one or a few gradient steps on feature extraction and generalization is analyzed. These investigations aim to improve model performance, reduce computational cost, and deepen our theoretical understanding of how these algorithms learn, with implications for applications ranging from personalized medicine to robust AI systems.
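The update rule itself is simple: parameters theta are moved against the gradient of the loss, theta <- theta - eta * grad L(theta), where eta is the step size. As a minimal sketch (in Python with NumPy, using illustrative placeholder data and variable names, not tied to any of the papers below), one gradient step on a least-squares objective looks like this:

    import numpy as np

    # Illustrative placeholder data and parameters.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))   # features
    y = rng.normal(size=100)        # targets
    theta = np.zeros(5)             # model parameters
    eta = 0.1                       # step size (learning rate)

    # Loss: L(theta) = (1/2n) * ||X theta - y||^2
    # Gradient: grad L(theta) = (1/n) * X^T (X theta - y)
    residual = X @ theta - y
    grad = X.T @ residual / len(y)

    # One gradient step: theta <- theta - eta * grad L(theta)
    theta = theta - eta * grad

Repeating this update is gradient descent; much of the work listed below studies what even a single such step already accomplishes, or how to balance and schedule these steps across tasks and clients.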
Papers
A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
Behrad Moniri, Donghwan Lee, Hamed Hassani, Edgar Dobriban
Fed-GraB: Federated Long-tailed Learning with Self-Adjusting Gradient Balancer
Zikai Xiao, Zihan Chen, Songshang Liu, Hualiang Wang, Yang Feng, Jin Hao, Joey Tianyi Zhou, Jian Wu, Howard Hao Yang, Zuozhu Liu