Model Pruning
Model pruning aims to reduce the computational cost and memory footprint of large neural networks by removing less important parameters while preserving, or sometimes even improving, performance. Current research focuses on efficient one-shot pruning methods, particularly for large language models (LLMs) and vision transformers (ViTs), often incorporating techniques such as gradient-based importance scoring, block-aware optimization, and prompt-based approaches. These advances matter for deploying large models on resource-constrained devices and for reducing the cost of training and inference, with implications both for understanding model architectures and for practical applications across many domains.
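To make the idea concrete, the simplest one-shot scheme scores each weight by its magnitude and zeroes out the lowest-scoring fraction in a single pass; gradient-based importance scoring would replace the magnitude score with a gradient-weighted one. Below is a minimal sketch in NumPy; the function name and setup are illustrative, not drawn from any particular paper:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """One-shot magnitude pruning: zero out the fraction `sparsity`
    of weights with the smallest absolute value."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)          # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest |w|
    mask = np.abs(weights) > threshold     # keep only weights above it
    return weights * mask, mask

# Example: prune half of an 8x8 weight matrix in one shot.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
pruned, mask = magnitude_prune(w, 0.5)
```

In practice the mask is applied once after (or during) training and the surviving weights may then be fine-tuned; more sophisticated methods differ mainly in how the importance score is computed, not in this basic keep/remove mechanism.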