Adaptive Inference
Adaptive inference improves the efficiency of machine learning inference by dynamically adjusting the computation spent on each input according to its characteristics. Current research focuses on efficient algorithms and architectures for this adaptation, such as early-exit networks, cascaded ensembles, and dynamic sub-networks within transformers, often driven by input-dependent routing or layer-selection mechanisms. This approach holds significant promise for reducing computational cost, energy consumption, and latency while maintaining accuracy, particularly in resource-constrained settings such as edge devices and IoT platforms.
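To make the idea concrete, the sketch below illustrates one common instance of adaptive inference, an early-exit network, in PyTorch. It is a minimal, hypothetical example rather than the implementation from any particular paper: every class, parameter, and threshold value here is an assumption chosen for illustration. Each block of the network is paired with an auxiliary classifier head, and at inference time the model stops at the first head whose softmax confidence clears a threshold, so easy inputs use fewer layers than hard ones.

```python
# Minimal sketch of an early-exit classifier (illustrative only, not a
# reference implementation). Each block has an auxiliary exit head; inference
# stops at the first exit whose max softmax probability clears `threshold`.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitMLP(nn.Module):
    def __init__(self, in_dim=784, hidden=256, num_classes=10, num_blocks=3):
        super().__init__()
        self.blocks = nn.ModuleList()
        self.exits = nn.ModuleList()
        dim = in_dim
        for _ in range(num_blocks):
            self.blocks.append(nn.Sequential(nn.Linear(dim, hidden), nn.ReLU()))
            self.exits.append(nn.Linear(hidden, num_classes))  # auxiliary head
            dim = hidden

    @torch.no_grad()
    def adaptive_forward(self, x, threshold=0.9):
        """Input-dependent depth: return from the first sufficiently confident
        exit; otherwise fall through to the final head."""
        h = x
        for i, (block, exit_head) in enumerate(zip(self.blocks, self.exits)):
            h = block(h)
            probs = F.softmax(exit_head(h), dim=-1)
            conf, pred = probs.max(dim=-1)
            if conf.item() >= threshold or i == len(self.blocks) - 1:
                return pred, i  # prediction and the exit index actually used

# Usage: "easy" inputs exit early and spend less compute; "hard" inputs
# traverse the full network.
model = EarlyExitMLP()
x = torch.randn(1, 784)  # a single input sample
pred, exit_idx = model.adaptive_forward(x, threshold=0.9)
print(f"predicted class {pred.item()} using exit {exit_idx}")
```

The confidence threshold is the knob that trades accuracy for compute: a lower threshold exits earlier and saves more computation, while a higher one defers more inputs to deeper layers. Cascaded ensembles and dynamic sub-network approaches follow the same principle but route between separate models or activate subsets of a larger model instead of exiting at intermediate layers.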