Latency Predictor
Latency predictors are machine learning models that estimate the execution time of neural networks on a given hardware platform. They support hardware-aware neural architecture search and efficient deployment, where measuring every candidate network on every target device is impractical. Current research focuses on improving prediction accuracy and generalization across diverse hardware and network architectures, using techniques such as transfer learning, meta-learning, and regression networks with specialized encodings of network operations and hardware characteristics. Accurate latency prediction accelerates the development of efficient deep learning systems and helps select models that meet inference-time budgets, particularly in resource-constrained environments such as edge devices.
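To make the idea concrete, the following is a minimal sketch of a regression-based latency predictor, not the method of any paper listed here. It assumes a network can be described as a list of operation names, encodes it as per-operation counts, and fits a least-squares linear model for a single device. The operation vocabulary `OPS`, the per-operation costs, and the training "measurements" are all hypothetical and synthetic; published predictors use richer encodings and learned models, and add hardware features to generalize across devices.

```python
import numpy as np

# Hypothetical operation vocabulary (an assumption for illustration only).
OPS = ["conv3x3", "conv1x1", "pool", "linear"]


def encode(network):
    """Encode a network (list of op names) as per-op counts plus a bias term."""
    counts = [float(network.count(op)) for op in OPS]
    return np.array(counts + [1.0])


def fit_predictor(networks, latencies_ms):
    """Fit a least-squares linear map from op counts to latency (ms)."""
    X = np.stack([encode(n) for n in networks])
    y = np.asarray(latencies_ms, dtype=float)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def predict(weights, network):
    """Predict latency (ms) for one network."""
    return float(encode(network) @ weights)


# Synthetic, made-up per-op costs standing in for measurements on one device.
per_op_cost = {"conv3x3": 2.0, "conv1x1": 0.5, "pool": 0.1, "linear": 1.0}
overhead_ms = 0.3  # fixed per-network overhead, also made up

rng = np.random.default_rng(0)
train_networks = [
    [str(op) for op in rng.choice(OPS, size=int(rng.integers(3, 12)))]
    for _ in range(50)
]
train_latencies = [
    sum(per_op_cost[op] for op in net) + overhead_ms for net in train_networks
]

weights = fit_predictor(train_networks, train_latencies)

# Predict the latency of an unseen network.
test_network = ["conv3x3", "conv3x3", "pool", "linear"]
predicted = predict(weights, test_network)
print(f"predicted latency: {predicted:.2f} ms")
```

Because the synthetic latencies are an exact linear function of the op counts, the fit recovers them almost perfectly. Real measurements are noisy and depend on operator fusion, memory traffic, and scheduling, which is why the literature turns to learned encodings and to transfer across devices.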
Papers
Data-driven Predictive Latency for 5G: A Theoretical and Experimental Analysis Using Network Measurements
Marco Skocaj, Francesca Conserva, Nicol Sarcone Grande, Andrea Orsi, Davide Micheli, Giorgio Ghinamo, Simone Bizzarri, Roberto Verdone
Performance Modeling of Data Storage Systems using Generative Models
Abdalaziz Rashid Al-Maeeni, Aziz Temirkhanov, Artem Ryzhikov, Mikhail Hushchyn