Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is an iterative optimization algorithm for minimizing a function. It is especially useful in machine learning for training large models, where computing the exact gradient over the full dataset is prohibitively expensive; instead, each step uses a noisy gradient estimate computed on a small random mini-batch. Current research focuses on improving SGD's efficiency and convergence properties, for example through variants such as Adam, techniques such as momentum, adaptive learning rates, and line search, and analyses of its behavior in high-dimensional and non-convex settings. These advances are central to training deep neural networks and improve the performance of machine learning applications in fields ranging from natural language processing to healthcare.
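To make the update concrete, here is a minimal sketch of mini-batch SGD with heavy-ball momentum in plain NumPy. The synthetic linear-regression problem, the squared-error loss, and the hyperparameter values are illustrative assumptions for this sketch, not taken from any of the papers listed below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data: y = X @ w_true + noise (illustrative only).
n_samples, n_features = 1_000, 5
X = rng.normal(size=(n_samples, n_features))
w_true = rng.normal(size=n_features)
y = X @ w_true + 0.1 * rng.normal(size=n_samples)

def batch_gradient(w, X_batch, y_batch):
    """Gradient of the mean-squared-error loss on one mini-batch."""
    residual = X_batch @ w - y_batch
    return 2.0 * X_batch.T @ residual / len(y_batch)

# Hyperparameters (assumed values for the sketch).
lr, momentum, batch_size, n_epochs = 0.05, 0.9, 32, 20

w = np.zeros(n_features)        # parameters to learn
velocity = np.zeros_like(w)     # momentum buffer

for epoch in range(n_epochs):
    perm = rng.permutation(n_samples)              # reshuffle each epoch
    for start in range(0, n_samples, batch_size):
        idx = perm[start:start + batch_size]
        grad = batch_gradient(w, X[idx], y[idx])    # noisy gradient estimate
        velocity = momentum * velocity - lr * grad  # heavy-ball momentum update
        w += velocity

print("recovered weights:", np.round(w, 3))
print("true weights:     ", np.round(w_true, 3))
```

Each step touches only `batch_size` examples rather than the full dataset, which is what makes the per-iteration cost cheap; the momentum buffer smooths the resulting gradient noise.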
Papers
UrduFake@FIRE2021: Shared Track on Fake News Identification in Urdu
Maaz Amjad, Sabur Butt, Hamza Imam Amjad, Grigori Sidorov, Alisa Zhila, Alexander Gelbukh
Stochastic Gradient Descent and Anomaly of Variance-flatness Relation in Artificial Neural Networks
Xia Xiong, Yong-Cong Chen, Chunxiao Shi, Ping Ao
On uniform-in-time diffusion approximation for stochastic gradient descent
Lei Li, Yuliang Wang
On the Maximum Hessian Eigenvalue and Generalization
Simran Kaur, Jeremy Cohen, Zachary C. Lipton
sqSGD: Locally Private and Communication Efficient Federated Learning
Yan Feng, Tao Xiong, Ruofan Wu, LingJuan Lv, Leilei Shi
A General Theory for Federated Optimization with Asynchronous and Heterogeneous Clients Updates
Yann Fraboni, Richard Vidal, Laetitia Kameni, Marco Lorenzi