Model Pruning
Model pruning aims to reduce the computational cost and memory footprint of large neural networks by removing less important parameters, while preserving or even improving performance. Current research focuses on developing efficient one-shot pruning methods, particularly for large language models (LLMs) and vision transformers (ViTs), often incorporating techniques like gradient-based importance scoring, block-aware optimization, and prompt-based approaches. These advancements are crucial for deploying sophisticated AI models on resource-constrained devices and improving the efficiency of training and inference, impacting both scientific understanding of model architectures and practical applications across various domains.
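As a concrete illustration of the gradient-based importance scoring mentioned above, here is a minimal sketch of one-shot unstructured pruning. The function name and the first-order score |w · g| are illustrative assumptions, not the method of any specific paper: a weight's importance is approximated by the magnitude of its weight-times-gradient product, and the lowest-scoring fraction is zeroed in a single pass.

```python
import numpy as np

def one_shot_prune(weights: np.ndarray, grads: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the `sparsity` fraction of weights with the lowest
    first-order importance score |w * g| (a common gradient-based proxy).

    Ties at the threshold may prune slightly more than the requested
    fraction; a production implementation would break ties explicitly.
    """
    scores = np.abs(weights * grads)
    k = int(sparsity * weights.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest score becomes the pruning threshold
    threshold = np.partition(scores.ravel(), k - 1)[k - 1]
    mask = scores > threshold
    return weights * mask
```

Structured variants (e.g. block-aware pruning for LLMs and ViTs) apply the same idea at the granularity of whole rows, heads, or blocks rather than individual weights, which maps better onto real hardware speedups.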