RNN Inference

RNN inference focuses on executing recurrent neural networks efficiently, aiming to reduce computational cost and energy consumption while preserving accuracy. Current research emphasizes optimizing RNN architectures such as LSTMs and GRUs, exploiting weight and activity sparsity to skip redundant computation, and developing specialized hardware accelerators. These advances are crucial for deploying RNNs in resource-constrained environments and for enabling real-time applications such as network traffic analysis and speech recognition, where speed and energy efficiency are paramount.
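The sketch below illustrates, in NumPy, how the two sparsity ideas mentioned above can reduce work in a single GRU cell at inference time: magnitude-based weight pruning (weight sparsity) and thresholded "delta" inputs that skip features whose value barely changed since the previous step (activity sparsity). It is a minimal illustration under assumed thresholds and a hypothetical `SparseDeltaGRUCell` class, not the method of any particular paper or library.

```python
# Minimal NumPy sketch: weight pruning + delta-threshold input skipping in a GRU cell.
# All names, shapes, and thresholds are illustrative assumptions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def prune_weights(W, sparsity=0.8):
    """Zero out the smallest-magnitude entries so only (1 - sparsity) of them remain."""
    thresh = np.quantile(np.abs(W), sparsity)
    return np.where(np.abs(W) < thresh, 0.0, W)

class SparseDeltaGRUCell:
    """GRU cell that only recomputes input contributions for features that changed enough."""
    def __init__(self, input_size, hidden_size, delta=0.05, rng=None):
        rng = rng or np.random.default_rng(0)
        scale = 1.0 / np.sqrt(hidden_size)
        # Stacked gate weights: update (z), reset (r), candidate (h~), pruned for weight sparsity.
        self.Wx = prune_weights(rng.uniform(-scale, scale, (3 * hidden_size, input_size)))
        self.Wh = prune_weights(rng.uniform(-scale, scale, (3 * hidden_size, hidden_size)))
        self.b = np.zeros(3 * hidden_size)
        self.delta = delta
        self.x_prev = np.zeros(input_size)          # last input values actually used
        self.Wx_x = np.zeros(3 * hidden_size)       # running partial product Wx @ x_prev
        self.H = hidden_size

    def step(self, x, h):
        # Activity sparsity: update the input product only for features whose change exceeds delta.
        dx = x - self.x_prev
        active = np.abs(dx) >= self.delta
        self.Wx_x += self.Wx[:, active] @ dx[active]
        self.x_prev = np.where(active, x, self.x_prev)

        gates_x = self.Wx_x + self.b
        gates_h = self.Wh @ h
        H = self.H
        z = sigmoid(gates_x[:H] + gates_h[:H])                    # update gate
        r = sigmoid(gates_x[H:2 * H] + gates_h[H:2 * H])          # reset gate
        h_tilde = np.tanh(gates_x[2 * H:] + r * gates_h[2 * H:])  # candidate state
        return (1.0 - z) * h + z * h_tilde

# Usage: run a short random sequence through the cell.
cell = SparseDeltaGRUCell(input_size=16, hidden_size=32)
h = np.zeros(32)
rng = np.random.default_rng(1)
for t in range(10):
    h = cell.step(rng.normal(size=16), h)
print("final hidden state norm:", np.linalg.norm(h))
```

Because the hidden state depends only on additive updates from changed inputs, the running product `Wx_x` stays exact while the per-step matrix-vector work shrinks with the fraction of near-static input features; pruned weight matrices would additionally map to sparse kernels or accelerator datapaths in a real deployment.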

Papers