Latency Predictor

Latency predictors are machine learning models that estimate the execution time of neural networks on a given hardware platform. They guide hardware-aware neural architecture search and deployment decisions without requiring every candidate network to be measured on a physical device. Current research emphasizes improving prediction accuracy and generalization across diverse hardware and network architectures, using techniques such as transfer learning, meta-learning, and regression networks with specialized encodings of network operations and hardware characteristics. Accurate latency prediction speeds up the development of efficient deep learning systems and supports faster training and inference, particularly in resource-constrained environments such as edge devices.
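As a rough illustration of the regression idea described above, the sketch below encodes an architecture as per-operation counts and fits an additive cost model by least squares. It is a deliberately simplified stand-in for the learned predictors in the literature, not any particular paper's method. The operation vocabulary, the per-operation costs, and the training latencies are all hypothetical, synthetic values chosen for the example, not measurements from real hardware.

```python
from typing import Dict, List, Sequence

# Hypothetical operation vocabulary; real predictors use richer encodings
# (tensor shapes, graph structure, hardware descriptors).
OPS = ["conv3x3", "conv1x1", "pool"]

# Assumed "ground truth" costs in milliseconds, used only to generate
# synthetic training data. These are made-up numbers, not measurements.
TRUE_COST_MS: Dict[str, float] = {"conv3x3": 2.0, "conv1x1": 0.5, "pool": 0.1}
TRUE_OVERHEAD_MS = 1.0


def encode(arch: Sequence[str]) -> List[float]:
    """Encode an architecture as per-op counts plus a constant bias feature."""
    return [float(list(arch).count(op)) for op in OPS] + [1.0]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(archs: Sequence[Sequence[str]], latencies_ms: Sequence[float]) -> List[float]:
    """Least-squares fit of per-op costs via the normal equations X^T X w = X^T y."""
    X = [encode(a) for a in archs]
    d = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * y for row, y in zip(X, latencies_ms)) for i in range(d)]
    return solve(xtx, xty)


def predict(weights: Sequence[float], arch: Sequence[str]) -> float:
    """Predicted latency in milliseconds for one architecture."""
    return sum(w * f for w, f in zip(weights, encode(arch)))


def synthetic_latency(arch: Sequence[str]) -> float:
    """Noise-free synthetic latency derived from the assumed costs above."""
    return TRUE_OVERHEAD_MS + sum(TRUE_COST_MS[op] for op in arch)


TRAIN_ARCHS = [
    ["conv3x3"],
    ["conv1x1"],
    ["pool"],
    ["conv3x3", "conv3x3", "pool"],
    ["conv3x3", "conv1x1", "conv1x1"],
    ["conv1x1", "pool", "pool"],
]
TRAIN_LATENCIES = [synthetic_latency(a) for a in TRAIN_ARCHS]

weights = fit(TRAIN_ARCHS, TRAIN_LATENCIES)

if __name__ == "__main__":
    unseen = ["conv3x3", "conv1x1", "pool"]
    print(f"predicted latency: {predict(weights, unseen):.3f} ms")
```

An additive model like this ignores operator fusion, memory effects, and other cross-operation interactions on real hardware, which is part of the motivation for the richer encodings and transfer or meta-learning approaches mentioned above.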

Papers