Heterogeneous Cluster
Heterogeneous clusters, which combine diverse computing resources such as CPUs and GPUs, are central to meeting the computational demands of modern machine learning, particularly large language models (LLMs) and other deep learning workloads. Current research focuses on optimizing resource allocation and scheduling within these clusters to improve training efficiency and reduce energy consumption, often through techniques such as adaptive parallelism, model partitioning, and quantization. This work advances the capabilities of AI systems while mitigating the environmental and economic costs of deploying them, with impact ranging from scientific computing to cloud-based AI services.
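As a concrete illustration of one of these techniques, the sketch below shows heterogeneity-aware model partitioning: contiguous blocks of model layers are assigned to devices in proportion to each device's measured throughput, so that pipeline stages finish at roughly the same time. The device names, throughput figures, and the partition_layers helper are hypothetical illustrations, not drawn from any specific paper surveyed here.

from dataclasses import dataclass

@dataclass
class Device:
    name: str               # hypothetical device label, e.g. "A100"
    rel_throughput: float   # measured relative throughput (layers/sec), assumed given

def partition_layers(num_layers: int, devices: list[Device]) -> dict[str, range]:
    """Assign contiguous layer ranges to devices, sized in proportion to
    each device's throughput, balancing per-stage pipeline latency."""
    total = sum(d.rel_throughput for d in devices)
    assignment, start = {}, 0
    for i, d in enumerate(devices):
        # The last device absorbs any rounding remainder.
        share = (num_layers - start if i == len(devices) - 1
                 else round(num_layers * d.rel_throughput / total))
        assignment[d.name] = range(start, start + share)
        start += share
    return assignment

if __name__ == "__main__":
    # Hypothetical mixed-generation cluster: a fast GPU, a mid-tier GPU,
    # and a slow accelerator, partitioning a 48-layer model.
    cluster = [Device("A100", 3.0), Device("V100", 1.5), Device("T4", 0.5)]
    for name, layers in partition_layers(48, cluster).items():
        print(f"{name}: layers {layers.start}-{layers.stop - 1}")

Running this assigns 29 layers to the fastest device, 14 to the mid-tier one, and 5 to the slowest; real systems refine this idea with per-layer cost profiles and communication-aware placement rather than a single scalar throughput.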