Energy-Efficient Inference
Energy-efficient inference focuses on minimizing the computational resources and power consumption required to run deep learning models, particularly at the edge, where resources are limited. Current research emphasizes techniques such as model compression (e.g., pruning, quantization, knowledge distillation), efficient algorithms (e.g., spiking neural networks, dynamic decision trees), and hardware-aware optimization (e.g., mapping DNNs to multi-accelerator SoCs, specialized hardware accelerators); a sketch of two compression techniques follows below. These advances are crucial for deploying AI in resource-constrained environments such as embedded systems and IoT devices, for reducing the environmental impact of AI, and for broadening access to AI applications.
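As a minimal sketch of two of the compression techniques named above, the PyTorch snippet below applies unstructured magnitude pruning followed by post-training dynamic quantization to a toy model. The layer sizes and the 50% sparsity level are illustrative assumptions, not drawn from any particular paper.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model standing in for a network to be deployed on an edge device.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# 1. Unstructured magnitude pruning: zero the 50% of weights with the
#    smallest absolute value in the first linear layer, then make the
#    sparsity permanent so the layer holds an ordinary tensor again.
prune.l1_unstructured(model[0], name="weight", amount=0.5)
prune.remove(model[0], "weight")

# 2. Post-training dynamic quantization: weights of the listed module
#    types are stored as int8 and dequantized on the fly, cutting model
#    size and the energy cost of memory traffic at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The compressed model is a drop-in replacement for inference; accuracy
# should be re-validated after compression.
x = torch.randn(1, 128)
print(quantized(x).shape)  # torch.Size([1, 10])
```

Dynamic quantization is shown here because it requires no calibration data; static quantization or quantization-aware training typically recovers more accuracy at the cost of additional effort.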