Zeroth-Order Optimization

Zeroth-order optimization (ZO) addresses the problem of optimizing functions whose gradients are unavailable or computationally expensive to obtain, a common hurdle in complex machine learning problems. Current research focuses on improving the efficiency and scalability of ZO methods, particularly for large language model (LLM) fine-tuning and federated learning, using techniques such as randomized gradient estimation and sparse parameter updates within algorithms like ZO-SGD and MeZO. These advances are significant because they enable memory-efficient training of large models on resource-constrained devices and facilitate privacy-preserving collaborative learning, with applications ranging from drug discovery to reinforcement learning.
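To make the randomized gradient estimation concrete, here is a minimal sketch of one ZO-SGD step using the standard two-point estimator: the gradient is approximated along a random direction z from two forward evaluations of the loss, with no backpropagation. The names `zo_sgd_step` and `loss_fn`, and the hyperparameter values, are illustrative assumptions, not taken from any specific paper's code.

```python
import numpy as np

def zo_sgd_step(params, loss_fn, lr=1e-3, mu=1e-3, rng=None):
    """One ZO-SGD step with the two-point randomized gradient estimator.

    Estimates the directional derivative along a random Gaussian
    direction z:
        g_hat = (loss(theta + mu*z) - loss(theta - mu*z)) / (2*mu) * z
    Only two loss evaluations are needed; no gradients are computed.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(params.shape)     # random perturbation direction
    loss_plus = loss_fn(params + mu * z)      # forward evaluation 1
    loss_minus = loss_fn(params - mu * z)     # forward evaluation 2
    grad_scale = (loss_plus - loss_minus) / (2 * mu)
    return params - lr * grad_scale * z       # descent step along z

# Toy usage: minimize f(x) = ||x||^2 without ever computing a gradient.
theta = np.ones(10)
for _ in range(500):
    theta = zo_sgd_step(theta, lambda x: float(np.dot(x, x)), lr=0.05, mu=1e-3)
```

MeZO applies this same two-point scheme to LLM fine-tuning but regenerates z from a saved random seed rather than storing it, perturbing parameters in place, which is what keeps its memory footprint close to that of inference.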

Papers