Block Coordinate Descent

Block coordinate descent (BCD) is an optimization technique that iteratively updates subsets of variables in a large optimization problem, improving efficiency by reducing computational complexity at each step. Current research focuses on applying BCD to various machine learning tasks, including training large language models (LLMs), solving regularized regression problems, and performing multi-objective optimization, often incorporating it within algorithms like Adam or within hierarchical Bayesian frameworks. This approach is particularly valuable for handling massive datasets and high-dimensional models, leading to significant improvements in training speed and memory efficiency for applications ranging from natural language processing to personalized medicine and distributed systems.

Papers