Full-Batch Gradient Descent

Full-batch gradient descent (GD) is an optimization algorithm that updates model parameters using the gradient of the loss computed over the entire dataset at each iteration, yielding deterministic updates and stable convergence. Its main drawback is cost: every step requires a pass over all the data, which strains memory and compute for large datasets and complex models such as Graph Neural Networks (GNNs). Current research therefore focuses on mitigating these limitations through techniques such as feature slicing, adaptive caching, and optimized distributed training schemes. These efforts aim to make full-batch GD scalable and efficient enough for large-scale machine learning tasks while addressing memory constraints and communication overhead, with applications ranging from training deep learning models to graph analysis.
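
The update rule is simple: compute the gradient over all samples, then take one step. Below is a minimal sketch for a least-squares objective using NumPy; the synthetic data, learning rate, and iteration count are illustrative assumptions, not taken from any of the papers referenced here.

```python
import numpy as np

# Synthetic regression data (illustrative assumptions: 1000 samples, 5 features).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.array([1.5, -2.0, 0.5, 3.0, -1.0])
y = X @ true_w + 0.1 * rng.normal(size=1000)

# Full-batch gradient descent on mean-squared error.
w = np.zeros(5)
lr = 0.1                      # learning rate (assumed)
for step in range(500):
    # Unlike mini-batch/stochastic GD, the gradient uses the *entire* dataset each iteration.
    residual = X @ w - y
    grad = (2.0 / len(y)) * X.T @ residual
    w -= lr * grad

print(np.round(w, 3))         # converges toward true_w
```

Because the full gradient is exact, each step is noise-free and convergence is smooth, but the per-step cost grows linearly with dataset size, which is precisely the scalability issue the techniques above target.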

Papers