Table Sharding

Table sharding, the partitioning of large datasets or model parameters across multiple computing resources, aims to improve the scalability and efficiency of machine learning and distributed systems. Current research focuses on efficient sharding algorithms, including approaches that use neural cost models to find good partitions, as well as techniques such as coded computing and quantization that reduce communication overhead and storage requirements. These advances are crucial for training and deploying increasingly large models, particularly in resource-constrained environments, and affect applications ranging from large language models to recommender systems.
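
To make the core idea concrete, the minimal sketch below partitions the rows of one logical embedding table across several shards using a simple modulo placement rule and routes each lookup to the shard that owns the row. The sizes, the placement rule, and helper names such as `lookup` are illustrative assumptions, not drawn from any specific paper or system.

```python
import numpy as np

# Minimal sketch of row-wise (modulo) table sharding; sizes and names are
# illustrative assumptions, not from any particular system.
NUM_ROWS, DIM, NUM_SHARDS = 10_000, 16, 4
rng = np.random.default_rng(0)

# Each shard stores only its slice of rows: shard s owns rows s, s + NUM_SHARDS, ...
shards = [
    rng.standard_normal((int(np.ceil((NUM_ROWS - s) / NUM_SHARDS)), DIM)).astype(np.float32)
    for s in range(NUM_SHARDS)
]

def lookup(rows):
    """Route each global row id to its owning shard, then gather the vectors."""
    out = np.empty((len(rows), DIM), dtype=np.float32)
    for i, r in enumerate(rows):
        shard_id, local_row = r % NUM_SHARDS, r // NUM_SHARDS  # placement rule
        out[i] = shards[shard_id][local_row]
    return out

print(lookup([3, 42, 9_999]).shape)  # (3, 16)
```

The work summarized above generally replaces this fixed modulo rule with smarter placements, for example learned or cost-model-driven partitioners that balance per-device memory and communication cost.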

Papers