Edge Inference
Edge inference runs machine learning models directly on resource-constrained devices at the network edge, reducing the latency, bandwidth consumption, and privacy risks associated with cloud-based processing. Current research emphasizes efficient model architectures (such as Vision Transformers and MobileNets), optimization techniques (including quantization, pruning, and model merging), and task offloading strategies that balance accuracy against resource usage. The field underpins real-time AI applications in areas such as video analytics, natural language processing, and robotics, and it drives advances in both hardware and software for efficient AI deployment.
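As a rough illustration of one optimization technique named above, the sketch below applies symmetric per-tensor int8 quantization to a small list of weights. It is a minimal example under stated assumptions, not a method taken from any listed paper: the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, and real toolchains typically add per-channel scales and calibration data.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per weight is bounded by about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is then stored in one byte instead of four, which is the kind of memory and bandwidth saving that makes quantization attractive on edge devices, at the cost of the small rounding error measured by `max_error`.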