Pruning Framework

Pruning frameworks aim to reduce the computational cost and memory footprint of deep neural networks (DNNs) by removing less important parameters or connections while preserving, and in some cases improving, accuracy. Current research develops efficient pruning algorithms for a range of architectures, including convolutional neural networks (CNNs), vision transformers (ViTs), and large language models (LLMs). It explores both pre-training and post-training approaches, and it compares structured pruning, which removes whole units such as channels, attention heads, or neurons, with unstructured pruning, which removes individual weights. These advances matter because they allow larger and more accurate models to be deployed on resource-constrained devices, with impact on computer vision, natural language processing, and edge computing.
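
The difference between unstructured and structured pruning can be illustrated with a minimal sketch. It assumes magnitude as the importance criterion (absolute value for single weights, L2 norm for whole rows), which is only one common choice and not the method of any particular framework. The function names `unstructured_prune` and `structured_prune` and the example matrix are hypothetical, and the weight matrix is a plain list of lists so the code runs without external libraries.

```python
from typing import List

Matrix = List[List[float]]


def unstructured_prune(weights: Matrix, sparsity: float) -> Matrix:
    """Zero the smallest-magnitude individual weights.

    The matrix keeps its shape; only the values change, so any speedup
    depends on sparse storage or sparse kernels.
    """
    # Rank every weight by absolute value, remembering its position.
    ranked = sorted(
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    )
    k = int(len(ranked) * sparsity)  # number of weights to zero
    pruned = [row[:] for row in weights]
    for _, i, j in ranked[:k]:
        pruned[i][j] = 0.0
    return pruned


def structured_prune(weights: Matrix, sparsity: float) -> Matrix:
    """Remove whole rows (for example, output neurons) with the smallest L2 norm.

    The result is a smaller dense matrix, which ordinary dense kernels
    can exploit directly.
    """
    norms = sorted(
        (sum(w * w for w in row) ** 0.5, i) for i, row in enumerate(weights)
    )
    k = int(len(weights) * sparsity)  # number of rows to drop
    dropped = {i for _, i in norms[:k]}
    return [row[:] for i, row in enumerate(weights) if i not in dropped]


# Hypothetical 4x3 weight matrix (4 output neurons, 3 inputs).
W = [
    [0.9, -0.1, 0.4],
    [0.05, 0.02, -0.03],
    [-0.7, 0.6, 0.2],
    [0.3, -0.8, 0.01],
]

sparse_W = unstructured_prune(W, 0.5)  # same 4x3 shape, half the entries zeroed
small_W = structured_prune(W, 0.5)     # 2x3 matrix, two weakest rows removed
```

In this sketch both calls remove half of the parameters, but only the structured variant changes the matrix shape. That is the usual trade-off: unstructured pruning is more fine-grained, while structured pruning yields a smaller dense model that needs no special sparse support.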

Papers