Auto Scaling
Auto scaling dynamically adjusts computing resources to match fluctuating demand, with the goal of using resources efficiently while still meeting service level objectives. Current research focuses on more sophisticated prediction models, including those based on recurrent neural networks, graph neural networks, and meta-reinforcement learning, which anticipate workload changes so that scaling decisions become more accurate and efficient. These advances matter for managing increasingly complex cloud-based systems and large-scale machine learning training, and they aim at cost savings and better performance in applications ranging from serverless functions to large language model training.
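The predict-then-scale loop described above can be illustrated with a minimal sketch. It is not taken from any of the listed papers: the moving-average forecaster is a deliberately simple stand-in for the learned predictors mentioned (recurrent or graph neural networks, meta-reinforcement learning), and the names `forecast_next`, `desired_replicas`, `capacity_per_replica`, and the replica bounds are hypothetical, chosen only for illustration.

```python
import math


def forecast_next(history, window=3):
    """Predict the next load value as the mean of the last `window` samples.

    Assumption: a moving average stands in for a learned workload predictor.
    """
    if not history:
        raise ValueError("history must contain at least one sample")
    recent = list(history)[-window:]
    return sum(recent) / len(recent)


def desired_replicas(predicted_load, capacity_per_replica,
                     min_replicas=1, max_replicas=10):
    """Return the replica count needed to serve the predicted load.

    The count is rounded up so capacity is never below the forecast,
    then clamped to the hypothetical [min_replicas, max_replicas] range.
    """
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    needed = math.ceil(predicted_load / capacity_per_replica)
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    # Hypothetical request rates (requests per second) from recent intervals.
    history = [120.0, 180.0, 240.0]
    predicted = forecast_next(history)
    replicas = desired_replicas(predicted, capacity_per_replica=50.0)
    print(f"predicted load: {predicted}, replicas: {replicas}")
```

In this sketch the forecaster and the scaling rule are separate functions, so a more capable predictor could replace `forecast_next` without touching the clamping logic that guards against over- or under-provisioning.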