Edge Inference
Edge inference performs machine learning inference directly on resource-constrained devices at the network edge, reducing the latency, bandwidth consumption, and privacy risks associated with cloud-based processing. Current research emphasizes efficient model architectures (such as Vision Transformers and MobileNets), optimization techniques (including quantization, pruning, and model merging), and intelligent task-offloading strategies that balance accuracy against resource usage. The field is crucial for enabling real-time AI in areas such as video analytics, natural language processing, and robotics, and it drives advances in both hardware and software for efficient AI deployment.
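Of the optimization techniques above, quantization is the most widely deployed on edge hardware. As a minimal sketch (not any specific framework's API), symmetric per-tensor int8 quantization maps each float weight to an 8-bit integer via a single scale factor; the round-trip error is bounded by half the scale:

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats onto int8 range [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [qi * scale for qi in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
recovered = dequantize_int8(q, scale)
# Each recovered weight differs from the original by at most scale / 2.
```

Real deployments typically use per-channel scales and calibrated activation ranges, but the storage win is the same: 4x smaller weights than float32, plus faster integer arithmetic on edge accelerators.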