CNN Inference
CNN inference focuses on efficiently executing pre-trained convolutional neural networks (CNNs), minimizing computational cost and latency while preserving accuracy. Current research emphasizes inference in resource-constrained environments such as microcontrollers and edge devices, exploring techniques including sparse computation (e.g., processing only frame-to-frame differences in video), model compression (e.g., via autoencoders and attention mechanisms), and distributed inference across multiple devices. These advances are crucial for deploying CNNs in real-time applications such as robotics, IoT, and mobile computing.
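The sparse, difference-driven computation mentioned above can be illustrated with a minimal sketch: between consecutive video frames, only pixels whose value changed beyond a threshold need recomputation, and the remaining activations can be reused. The function name `sparse_frame_update` and the threshold value are illustrative assumptions, not from any specific paper.

```python
import numpy as np

def sparse_frame_update(prev_frame, curr_frame, threshold=0.05):
    """Return a boolean mask of pixels whose change exceeds `threshold`.

    In a delta-based inference scheme, only these pixels would be
    re-processed by the CNN; activations elsewhere are reused.
    """
    delta = np.abs(curr_frame - prev_frame)
    return delta > threshold

# Illustrative frames: a static background with one moving 10x10 patch.
rng = np.random.default_rng(0)
prev = rng.random((64, 64)).astype(np.float32)
curr = prev.copy()
curr[10:20, 10:20] += 0.5  # simulate motion in a small region

mask = sparse_frame_update(prev, curr)
sparsity = 1.0 - mask.mean()  # fraction of pixels that can be skipped
```

In this toy scene only 100 of 4,096 pixels change, so over 97% of the per-pixel work could be skipped; real systems propagate such masks through convolution layers, where savings depend on how deltas dilate with each receptive field.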