Pruning Method
Neural network pruning aims to reduce model size and computational cost without significant performance loss, making models cheaper to store and faster to run. Current research explores both unstructured pruning (removing individual weights) and structured pruning (removing whole channels, attention heads, or layers) in convolutional neural networks (CNNs), transformers, and large language models (LLMs), often combined with techniques such as knowledge distillation or Bayesian methods to recover accuracy and improve speed. These advances matter for deploying deep learning models on resource-constrained devices and for accelerating inference, with impact on both scientific research and practical applications across diverse fields.
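As an illustration of the simplest unstructured approach mentioned above, magnitude pruning zeroes out the fraction of weights with the smallest absolute values. This is a minimal NumPy sketch, not the method of any specific paper surveyed here; the function name and threshold handling are our own choices:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Unstructured magnitude pruning: zero the smallest-|w| fraction.

    sparsity is the target fraction of weights to remove, in [0, 1).
    Ties at the threshold may keep slightly fewer weights than expected.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to prune
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask
```

In practice the resulting sparse mask is usually held fixed while the remaining weights are fine-tuned, since pruning alone typically costs some accuracy; structured variants instead remove entire rows, columns, or filters so the speedup is realized on standard hardware.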