Unstructured Pruning

Unstructured pruning aims to improve the efficiency of deep learning models by removing less important individual parameters (weights) without significantly degrading accuracy. Current research focuses on applying the technique to large language models (LLMs), convolutional neural networks (CNNs), and vision transformers (ViTs), typically using weight magnitude, gradient information, or activation patterns to decide which weights to remove. This work is significant because it reduces the computational cost and memory footprint of large models, enabling deployment on resource-constrained devices and, when sparse-aware kernels or hardware are available, accelerating inference, particularly for applications in natural language processing and computer vision.
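As an illustration of the magnitude-based criterion mentioned above, the following is a minimal NumPy sketch, not the method of any particular paper: the fraction of weights with the smallest absolute values is zeroed, and a binary mask records which entries survive. The function name and interface are hypothetical.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float):
    """Zero out the `sparsity` fraction of entries with smallest |w|.

    Returns the pruned weight array and a boolean keep-mask.
    This is a one-shot sketch; practical pipelines usually prune
    iteratively and fine-tune between pruning steps.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    # Keep only weights strictly above the threshold
    # (ties at the threshold are pruned as well)
    mask = np.abs(weights) > threshold
    return weights * mask, mask
```

The mask would typically be applied after every optimizer step during fine-tuning so that pruned weights stay at zero.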

Papers