Latency Prediction

Latency prediction estimates the execution time of neural networks or tensor programs on a given hardware platform, a crucial step in optimizing deep learning model deployment. Current research emphasizes robust and efficient predictors, often learned models such as neural networks, and uses transfer learning or domain adaptation to generalize across different hardware and software configurations. These advances accelerate neural architecture search, help optimize database systems, and improve efficiency on resource-constrained edge devices by enabling faster, better-informed decisions about model selection and deployment.
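
As a rough illustration of the learned-predictor idea described above, the sketch below fits the simplest possible model, a straight line from a single feature (GFLOPs) to measured latency, using ordinary least squares. It is only a toy under stated assumptions: the function names `fit_linear_latency` and `predict_latency` are hypothetical, the measurements are synthetic numbers invented for the example, and real predictors typically use richer features and models than a one-variable regression.

```python
from statistics import mean


def fit_linear_latency(gflops, latency_ms):
    """Least-squares fit of latency_ms ~= slope * gflops + intercept."""
    mx, my = mean(gflops), mean(latency_ms)
    sxx = sum((x - mx) ** 2 for x in gflops)
    sxy = sum((x - mx) * (y - my) for x, y in zip(gflops, latency_ms))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def predict_latency(model, gflops):
    """Estimate latency (ms) for a model with the given compute cost."""
    slope, intercept = model
    return slope * gflops + intercept


# Synthetic, made-up measurements for a hypothetical device (not real data).
# They are constructed to lie exactly on latency = 4 * gflops + 1.
train_gflops = [0.5, 1.0, 2.0, 4.0]
train_latency_ms = [3.0, 5.0, 9.0, 17.0]

model = fit_linear_latency(train_gflops, train_latency_ms)
estimate_ms = predict_latency(model, 3.0)  # unseen architecture at 3 GFLOPs
```

A predictor like this replaces an on-device measurement with a cheap function call, which is what makes it attractive inside a search loop. Adapting it to a new device would mean refitting or adjusting the parameters from a small number of measurements on that device.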

Papers