Network Pruning
Network pruning aims to reduce the size and computational cost of deep neural networks (DNNs) without significant performance loss, primarily by removing less important weights or connections. Current research focuses on developing efficient pruning algorithms for large language models (LLMs), convolutional neural networks (CNNs), and spiking neural networks (SNNs), often employing structured or unstructured pruning and incorporating optimization methods to preserve accuracy while improving speed. These advances are crucial for deploying large-scale DNNs on resource-constrained devices, improving energy efficiency, and accelerating inference across various applications.
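To make the distinction concrete, unstructured pruning removes individual weights by some importance criterion, most commonly their magnitude. The following is a minimal NumPy sketch of magnitude-based unstructured pruning; the function name and the choice of a global magnitude threshold are illustrative, not a specific method from the papers listed below.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Unstructured pruning sketch: zero out the fraction `sparsity`
    of weights with the smallest absolute value.

    weights  : ndarray of any shape (e.g. a layer's weight matrix)
    sparsity : float in [0, 1), target fraction of weights to remove
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    # keep only weights strictly above the threshold
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune half the weights of a tiny 2x2 "layer"
W = np.array([[0.1, -0.5],
              [0.9, -0.05]])
W_pruned = magnitude_prune(W, 0.5)
# The two smallest-magnitude entries (0.1 and -0.05) are zeroed,
# while -0.5 and 0.9 survive.
```

Structured pruning, by contrast, would remove entire rows, columns, filters, or attention heads at once, which trades some flexibility for dense tensors that standard hardware can accelerate without sparse kernels.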
Papers
Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning
Ziqi Zhou, Li Lian, Yilong Yin, Ze Wang
Recall Distortion in Neural Network Pruning and the Undecayed Pruning Algorithm
Aidan Good, Jiaqi Lin, Hannah Sieg, Mikey Ferguson, Xin Yu, Shandian Zhe, Jerzy Wieczorek, Thiago Serra