Heterogeneous GPU
Heterogeneous GPU computing focuses on efficiently utilizing diverse GPU types within a single system for tasks like large language model (LLM) training and inference. Current research emphasizes resource allocation and scheduling across heterogeneous hardware, employing techniques such as max-flow formulations for work assignment, reinforcement learning for resource partitioning, and adaptive batch sizes in stochastic gradient descent so that faster and slower devices stay load-balanced. This work is crucial for reducing cost and improving performance in computationally intensive applications, particularly in AI and high-performance computing, because it lets systems exploit a wider range of available hardware rather than requiring a fleet of identical GPUs.
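As a minimal sketch of the adaptive batch-size idea mentioned above: if each GPU's throughput (samples/sec) is known, the global batch can be split in proportion to throughput so that fast and slow devices finish their micro-batches in roughly the same wall time. The function name `split_batch` and the throughput figures below are hypothetical illustrations, not taken from any of the listed papers.

```python
def split_batch(global_batch, throughputs):
    """Split a global batch across heterogeneous GPUs in proportion to
    each GPU's measured throughput (samples/sec), so all devices take
    roughly equal time per step. Hypothetical sketch, not a paper's method."""
    total = float(sum(throughputs))
    raw = [global_batch * t / total for t in throughputs]
    shares = [int(r) for r in raw]  # floor of each proportional share
    remainder = global_batch - sum(shares)
    # Hand out leftover samples to GPUs with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - shares[i],
                   reverse=True)
    for i in order[:remainder]:
        shares[i] += 1
    return shares

# Example: an A100-class GPU measured ~3x faster than an older card
# gets 3/4 of the batch.
print(split_batch(100, [3, 1]))
```

Real systems (as surveyed above) refine this with max-flow or learned policies that also account for memory limits and interconnect bandwidth, but proportional splitting is the common starting point.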
Papers
October 22, 2024
June 3, 2024
May 25, 2024
May 14, 2024
April 22, 2024
March 24, 2024
March 2, 2024
February 6, 2024
November 22, 2023
November 17, 2023
August 29, 2023
January 1, 2023
October 8, 2022