Model Sparsification

Model sparsification aims to reduce the size and computational cost of machine learning models, particularly deep neural networks, without significant loss of accuracy. Current research focuses on efficient algorithms for pruning model weights, often using techniques such as L0 regularization, Bayesian methods, and structured sparsity to reach high sparsity levels across architectures including ResNets, Transformers, and graph neural networks. This work is driven by the need to deploy large models on resource-constrained devices and to improve training efficiency in federated learning, with direct implications for both the scalability and the energy consumption of AI systems. A minimal illustration of one such technique, magnitude-based weight pruning, is sketched below.
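
The sketch below shows global magnitude pruning with PyTorch's built-in `torch.nn.utils.prune` utilities, one simple instance of the unstructured weight pruning discussed above. The toy model and the 90% sparsity target are illustrative assumptions, not the method of any particular paper surveyed here.

```python
# A minimal sketch of unstructured magnitude pruning in PyTorch.
# The two-layer model and 90% sparsity level are illustrative choices.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Collect every weight tensor to be sparsified.
parameters_to_prune = [
    (module, "weight")
    for module in model.modules()
    if isinstance(module, nn.Linear)
]

# Zero out the 90% of weights with the smallest L1 magnitude,
# ranked globally across all listed layers rather than per layer.
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.9,
)

# Fold the binary masks into the weight tensors permanently.
for module, name in parameters_to_prune:
    prune.remove(module, name)

# Report the achieved overall sparsity.
zeros = sum((m.weight == 0).sum().item() for m, _ in parameters_to_prune)
total = sum(m.weight.numel() for m, _ in parameters_to_prune)
print(f"global sparsity: {zeros / total:.1%}")
```

Global (cross-layer) ranking, as opposed to pruning each layer by a fixed fraction, lets layers with many low-magnitude weights absorb more of the sparsity budget; methods based on L0 regularization or Bayesian posteriors instead learn which weights to drop during training rather than thresholding after it.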

Papers