Latency Prediction
Latency prediction estimates the execution time of neural networks or tensor programs on a given hardware platform, a crucial step in optimizing deep learning model deployment. Current research aims at prediction models that are both robust and efficient, typically learned predictors such as neural networks, with transfer learning or domain adaptation used to improve generalization across hardware and software configurations. These advances accelerate neural architecture search, help optimize database systems, and improve efficiency on resource-constrained edge devices by enabling faster, better-informed decisions about model selection and deployment.
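As a rough illustration of what a learned latency predictor does, the sketch below fits a two-term linear model, latency ≈ a * GFLOPs + b * MB moved, by ordinary least squares. This is a deliberately simple stand-in for the neural predictors mentioned above, not a method taken from any listed paper. The function names, the choice of features, and the sample measurements are all hypothetical and exist only to make the example runnable.

```python
from typing import List, Tuple

# (gflops, mbytes_moved, measured_latency_ms); synthetic values, not real benchmarks.
SYNTHETIC_SAMPLES: List[Tuple[float, float, float]] = [
    (1.0, 4.0, 4.0),
    (2.0, 2.0, 5.0),
    (0.5, 8.0, 5.0),
    (3.0, 1.0, 6.5),
]


def fit_latency_model(samples: List[Tuple[float, float, float]]) -> Tuple[float, float]:
    """Fit latency ~ a * gflops + b * mbytes (no intercept) by least squares.

    Solves the 2x2 normal equations in closed form.
    """
    sxx = sum(f * f for f, _, _ in samples)
    sxy = sum(f * m for f, m, _ in samples)
    syy = sum(m * m for _, m, _ in samples)
    sxt = sum(f * t for f, _, t in samples)
    syt = sum(m * t for _, m, t in samples)
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("features are collinear; collect more varied samples")
    a = (sxt * syy - syt * sxy) / det
    b = (syt * sxx - sxt * sxy) / det
    return a, b


def predict_latency(coeffs: Tuple[float, float], gflops: float, mbytes: float) -> float:
    """Predict latency in milliseconds for an unseen workload."""
    a, b = coeffs
    return a * gflops + b * mbytes


if __name__ == "__main__":
    coeffs = fit_latency_model(SYNTHETIC_SAMPLES)
    print(predict_latency(coeffs, gflops=1.5, mbytes=3.0))
```

In practice the coefficients would be refit, or adapted from a related device, for each hardware target, which is where the transfer learning and domain adaptation techniques described above come in.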