Parameter Pruning

Parameter pruning aims to reduce the size and computational cost of large neural networks, such as large language models (LLMs) and convolutional neural networks (CNNs), without significantly sacrificing performance. Current research focuses on developing efficient pruning algorithms, including those based on weight magnitude, layer merging, and concurrent training and pruning, often applied to specific architectures such as Transformers and ResNets. These advances are crucial for deploying complex models on resource-constrained devices and for mitigating the environmental cost of large-scale training, with impact on both scientific research and practical applications across many fields.
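Magnitude-based pruning, the simplest family mentioned above, removes the weights with the smallest absolute values on the assumption that they contribute least to the output. A minimal NumPy sketch (function name and sparsity target are illustrative, not from any specific paper):

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries of a weight matrix.

    sparsity: fraction of entries to prune, in [0, 1).
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of entries to remove
    if k == 0:
        return weights.copy()
    # Threshold = k-th smallest absolute value; entries at or below it are pruned.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune 90% of a random 64x64 weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
pruned = magnitude_prune(w, 0.9)
print(f"fraction zeroed: {(pruned == 0).mean():.2f}")
```

In practice, pruned models are usually fine-tuned afterward to recover accuracy, and structured variants remove whole channels or layers rather than individual weights so that the speedup is realized on standard hardware.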

Papers