Structured Sparsity
Structured sparsity in neural networks removes parameters in regular, hardware-friendly patterns (rather than at arbitrary positions) to reduce computational cost and memory footprint without significantly sacrificing accuracy. Current research emphasizes efficient algorithms for inducing and exploiting this sparsity in large language models (LLMs) and convolutional neural networks (CNNs), often through N:M sparsity, block sparsity, and pruning methods combined with quantization. This work is crucial for deploying large models on resource-constrained devices and for speeding up training and inference, affecting both the scalability of AI and its energy consumption.
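To make the N:M pattern concrete, below is a minimal sketch of magnitude-based 2:4 pruning in PyTorch: in every group of M consecutive weights, the N largest-magnitude entries are kept and the rest are zeroed. The function name and shapes are illustrative only and are not taken from the papers listed here.

```python
import torch

def prune_n_m(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Illustrative magnitude-based N:M pruning: in every group of m
    consecutive weights along the last dimension, keep the n
    largest-magnitude entries and zero the rest (e.g. 2:4 sparsity)."""
    assert weight.shape[-1] % m == 0, "last dimension must be divisible by m"
    groups = weight.reshape(-1, m)                      # one row per group of m weights
    # Indices of the (m - n) smallest-magnitude entries in each group.
    _, drop_idx = torch.topk(groups.abs(), k=m - n, dim=1, largest=False)
    mask = torch.ones_like(groups)
    mask.scatter_(1, drop_idx, 0.0)                     # zero out the dropped positions
    return (groups * mask).reshape(weight.shape)

# Example: prune a small weight matrix to 2:4 sparsity.
w = torch.randn(8, 16)
w_sparse = prune_n_m(w, n=2, m=4)
# Every group of 4 consecutive weights now has exactly 2 nonzeros.
print((w_sparse.reshape(-1, 4) != 0).sum(dim=1).unique())   # tensor([2])
```

The regular 2-in-4 pattern is what lets sparse tensor hardware (and block-sparse kernels more generally) skip the zeroed weights, which unstructured pruning generally cannot guarantee.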
Papers
A comprehensive study of spike and slab shrinkage priors for structurally sparse Bayesian neural networks
Sanket Jantre, Shrijita Bhattacharya, Tapabrata Maiti
Learning the hub graphical Lasso model with the structured sparsity via an efficient algorithm
Chengjing Wang, Peipei Tang, Wenling He, Meixia Lin