Layerwise Sparsity
Layerwise sparsity strategically removes less important parameters, or even entire layers, from neural networks to reduce computational cost and memory demands without significant performance loss. Rather than applying a single global pruning ratio, these methods allocate sparsity on a per-layer basis according to each layer's importance. Current research explores several routes to this goal, including structured sparsity, dynamic layer routing (as in Radial Networks), and blockwise pruning techniques such as BESA, often tailored to specific architectures like Vision Transformers and Large Language Models. This work matters because it addresses the growing need for efficient, deployable deep learning models, particularly in resource-constrained environments and large-scale applications.
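As a rough illustration of the core idea (and not a reproduction of any specific method such as BESA), the sketch below prunes each linear layer of a PyTorch model to a layer-specific sparsity ratio using simple weight-magnitude scores. The model, the `layer_sparsity` mapping, and the threshold logic are assumptions made for this example only.

```python
# Minimal sketch of layerwise (per-layer) magnitude pruning in PyTorch.
# The per-layer sparsity ratios below are illustrative assumptions, not
# values prescribed by any particular paper.
import torch
import torch.nn as nn


def prune_layerwise(model: nn.Module, layer_sparsity: dict[str, float]) -> None:
    """Zero out the smallest-magnitude weights in each named linear layer.

    layer_sparsity maps a module name to the fraction of weights to remove
    in that layer; layers not listed are left dense.
    """
    with torch.no_grad():
        for name, module in model.named_modules():
            ratio = layer_sparsity.get(name)
            if ratio is None or not isinstance(module, nn.Linear):
                continue
            weight = module.weight
            k = int(ratio * weight.numel())
            if k == 0:
                continue
            # Threshold = k-th smallest absolute weight within this layer.
            threshold = weight.abs().flatten().kthvalue(k).values
            weight.mul_(weight.abs() > threshold)


# Example: a toy two-layer MLP pruned more aggressively in its first layer.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
prune_layerwise(model, {"0": 0.7, "2": 0.3})  # hypothetical per-layer ratios
for name, m in model.named_modules():
    if isinstance(m, nn.Linear):
        density = (m.weight != 0).float().mean().item()
        print(f"layer {name}: {density:.2%} of weights remain")
```

Methods like BESA differ mainly in how the per-layer (or per-block) ratios are chosen, learning them against a reconstruction objective rather than fixing them by hand as done here.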