Structured Pruning
Structured pruning is a model compression technique that reduces the computational cost and memory footprint of deep neural networks (DNNs) by removing entire groups of parameters, such as neurons, filters, or channels, while preserving performance. Current research focuses on developing efficient structured pruning algorithms for a range of architectures, including convolutional neural networks (CNNs), vision transformers (ViTs), and large language models (LLMs), often combined with techniques such as knowledge distillation and one-shot pruning to minimize retraining overhead. This work matters because it enables the deployment of powerful DNNs on resource-constrained devices, improving the efficiency and accessibility of deep learning applications across diverse fields.
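To make the idea concrete, below is a minimal sketch of filter-level structured pruning, assuming PyTorch. It scores each convolutional filter by its L1 norm, keeps the highest-scoring filters, and rebuilds the layer so the pruned model is physically smaller. The function name `prune_conv_filters`, the 50% pruning ratio, and the layer shapes are illustrative assumptions, not any specific paper's method.

```python
import torch
import torch.nn as nn


def prune_conv_filters(conv: nn.Conv2d, ratio: float = 0.5) -> nn.Conv2d:
    """Return a new Conv2d keeping the (1 - ratio) filters with the largest L1 norm."""
    n_keep = max(1, int(conv.out_channels * (1.0 - ratio)))
    # One importance score per output channel: the L1 norm of its filter.
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    keep = torch.argsort(scores, descending=True)[:n_keep]
    # Rebuild a dense, smaller layer (a whole group of parameters is removed,
    # unlike unstructured pruning, which only zeroes individual weights).
    pruned = nn.Conv2d(
        conv.in_channels, n_keep, conv.kernel_size,
        stride=conv.stride, padding=conv.padding, bias=conv.bias is not None,
    )
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned


conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
smaller = prune_conv_filters(conv, ratio=0.5)
print(conv.weight.shape, "->", smaller.weight.shape)  # 64 filters -> 32 filters
```

In a full network, pruning a layer's output channels also requires shrinking the input channels of the layers that consume them; because whole channels disappear, the resulting model runs faster on standard dense kernels, without sparse-computation support.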
Papers
Learning Coarse-to-Fine Pruning of Graph Convolutional Networks for Skeleton-based Recognition
Hichem Sahbi
A Comparative Study of Pruning Methods in Transformer-based Time Series Forecasting
Nicholas Kiefer, Arvid Weyrauch, Muhammed Öz, Achim Streit, Markus Götz, Charlotte Debus
Structural Pruning via Spatial-aware Information Redundancy for Semantic Segmentation
Dongyue Wu, Zilin Guo, Li Yu, Nong Sang, Changxin Gao