Sparse Training

Sparse training aims to reduce the computational cost and memory footprint of deep neural networks by training models with significantly fewer active (nonzero) parameters, while maintaining or even improving accuracy. Current research focuses on efficient algorithms for creating and training sparse models, including dynamic sparsity adjustment (pruning and regrowing connections during training), sparsity-aware initialization, and hardware-accelerated sparse computation, often applied to transformer and convolutional architectures. These advances matter because they enable the deployment of large-scale models on resource-constrained devices and reduce the energy cost and environmental impact of training.
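
To make the idea of dynamic sparsity adjustment concrete, below is a minimal sketch in PyTorch of a prune-and-regrow update in the spirit of SET/RigL-style methods: the smallest-magnitude active weights are dropped and the same number of connections are regrown where the dense gradient is largest. The function names, the `drop_fraction` value, and the update schedule are illustrative assumptions, not any specific paper's implementation.

```python
import torch
import torch.nn as nn


def magnitude_prune_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a binary mask that keeps the largest-magnitude weights."""
    k = int(weight.numel() * (1.0 - sparsity))  # number of weights to keep
    if k < 1:
        return torch.zeros_like(weight)
    # Value of the k-th largest magnitude; everything at or above it is kept.
    threshold = weight.abs().flatten().kthvalue(weight.numel() - k + 1).values
    return (weight.abs() >= threshold).float()


def prune_and_regrow(weight, grad, mask, drop_fraction=0.3):
    """One dynamic-sparsity step (illustrative): drop the smallest active
    weights, then regrow that many connections at inactive positions with
    the largest dense-gradient magnitude."""
    active = mask.bool()
    n_drop = int(drop_fraction * active.sum().item())
    if n_drop == 0:
        return mask
    mask = mask.clone()
    # Drop: smallest-magnitude weights among the currently active set.
    active_mag = weight.abs().masked_fill(~active, float("inf"))
    drop_idx = active_mag.flatten().topk(n_drop, largest=False).indices
    mask.view(-1)[drop_idx] = 0.0
    # Regrow: inactive positions with the largest gradient magnitude.
    # (Simplification: just-dropped positions may be regrown here; full
    # implementations typically exclude them.)
    inactive_grad = grad.abs().masked_fill(mask.bool(), float("-inf"))
    grow_idx = inactive_grad.flatten().topk(n_drop, largest=True).indices
    mask.view(-1)[grow_idx] = 1.0
    return mask


# Usage sketch: initialize a 90%-sparse layer, then periodically update the
# mask inside the training loop after loss.backward():
layer = nn.Linear(128, 128)
mask = magnitude_prune_mask(layer.weight.data, sparsity=0.9)
layer.weight.data *= mask  # enforce the initial sparsity pattern
#     mask = prune_and_regrow(layer.weight.data, layer.weight.grad, mask)
#     layer.weight.data *= mask  # re-apply the updated mask
```

Note that the mask is enforced by zeroing weight values rather than by masking the forward pass, so the backward pass still produces dense gradients; this is what lets the regrowth step score inactive connections.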

Papers