Structured Sparsity
Structured sparsity in neural networks removes parameters in hardware-friendly patterns, rather than as arbitrary individual weights, to reduce computational cost and memory footprint without significantly sacrificing accuracy. Current research emphasizes efficient algorithms for inducing and exploiting such sparsity in large language models (LLMs) and convolutional neural networks (CNNs), most commonly through N:M sparsity, block sparsity, and pruning methods coupled with quantization. This work is crucial for deploying large models on resource-constrained devices and for improving the efficiency of both training and inference, affecting the scalability of AI as well as its energy consumption.
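For concreteness, below is a minimal NumPy sketch of N:M sparsity (here 2:4), the pattern mentioned above: within every contiguous group of M weights, only the N largest-magnitude weights are kept. The magnitude-based selection rule and the function name are illustrative assumptions, not taken from any particular paper or library.

```python
import numpy as np

def apply_nm_sparsity(weights: np.ndarray, n: int = 2, m: int = 4) -> np.ndarray:
    """Zero out all but the n largest-magnitude weights in each
    contiguous group of m along the last axis (e.g. 2:4 sparsity).
    Illustrative sketch; real kernels operate on packed sparse formats."""
    assert weights.shape[-1] % m == 0, "last dimension must be divisible by m"
    groups = weights.reshape(-1, m)  # one row per group of m weights
    # Indices of the (m - n) smallest-magnitude entries in each group.
    drop = np.argpartition(np.abs(groups), m - n, axis=1)[:, : m - n]
    mask = np.ones_like(groups, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=1)
    return (groups * mask).reshape(weights.shape)

# Example: prune a random 4x8 weight matrix to 2:4 sparsity.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
W_sparse = apply_nm_sparsity(W, n=2, m=4)
print(W_sparse)  # each group of 4 now has exactly 2 nonzero entries
```

Because the nonzeros follow a fixed per-group budget, such a matrix can be stored compactly (values plus small per-group indices) and accelerated by hardware that supports the pattern, which is what distinguishes structured from unstructured pruning.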