Stochastic First-Order Oracle

Stochastic First-Order Oracles (SFOs) are fundamental to the optimization algorithms used in machine learning: queried at a point, an SFO returns a noisy but unbiased estimate of the objective's gradient, typically computed from a random mini-batch of data, making it the natural model when exact gradients are unavailable or too expensive. Current research focuses on understanding and minimizing SFO complexity, the total number of stochastic gradient evaluations needed to reach a desired solution accuracy, across algorithms such as stochastic gradient descent (SGD) and its variants, with particular attention to how batch size and learning-rate strategies affect this count. This line of work is crucial for improving the efficiency and scalability of machine learning models, especially in large-scale settings such as deep learning and federated learning, where gradient computation is the dominant cost. A toy illustration of measuring SFO complexity follows.
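To make the notion concrete, here is a minimal NumPy sketch, not drawn from any particular paper: the toy least-squares problem, the function names (`stochastic_grad`, `sgd_sfo_complexity`), and all parameter choices are illustrative assumptions. It counts one SFO call per sampled example, so running mini-batch SGD at several batch sizes shows how batch size trades iterations against per-iteration oracle cost.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)**2,
# constructed so that an exact minimizer x_star exists (interpolation).
n, d = 1000, 20
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star


def true_grad(x):
    """Exact gradient of f. Used only to check the stopping criterion;
    a real algorithm would not have access to it, and these evaluations
    are not counted toward the SFO complexity."""
    return A.T @ (A @ x - b) / n


def stochastic_grad(x, batch_size):
    """Stochastic first-order oracle: an unbiased gradient estimate from
    a random mini-batch. Each sampled example costs one SFO call."""
    idx = rng.integers(0, n, size=batch_size)
    return A[idx].T @ (A[idx] @ x - b[idx]) / batch_size


def sgd_sfo_complexity(lr=0.02, batch_size=8, eps=1e-3, max_sfo=1_000_000):
    """Run mini-batch SGD and report the SFO complexity: the total number
    of per-example stochastic gradient evaluations made before the true
    gradient norm falls below eps."""
    x = np.zeros(d)
    sfo_calls = 0
    while sfo_calls < max_sfo:
        if np.linalg.norm(true_grad(x)) <= eps:
            break
        x -= lr * stochastic_grad(x, batch_size)
        sfo_calls += batch_size  # one oracle call per sampled example
    return sfo_calls


for bs in (1, 8, 64):
    print(f"batch_size={bs:3d} -> SFO calls to reach ||grad f|| <= 1e-3: "
          f"{sgd_sfo_complexity(batch_size=bs)}")
```

In this interpolation setting, larger batches reduce gradient noise but spend more oracle calls per iteration; the printed totals give a feel for the trade-off that the SFO-complexity literature analyzes formally.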

Papers