Dynamic Sparse Training
Dynamic sparse training (DST) trains sparse neural networks from scratch, adjusting the network's connectivity during training so that the model reaches high sparsity while keeping accuracy comparable to its dense counterpart. Current research focuses on optimizing the pruning and growing criteria, exploring different sparsity structures (e.g., unstructured, channel-level, N:M), and adapting DST to other architectures (including transformers and GANs) and learning paradigms (such as federated and continual learning). The approach can reduce computational cost and memory requirements in applications ranging from natural language processing and computer vision to resource-constrained settings such as edge devices and embedded systems.
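As a rough illustration of the ideas in the summary above, the sketch below shows one connectivity update and one way to build an N:M mask, using NumPy. It is not taken from any of the listed papers. The function names `prune_and_grow` and `n_m_mask`, the `fraction` parameter, and the specific criteria (drop the smallest-magnitude active weights, regrow at random inactive positions) are assumptions chosen for illustration; other pruning and growing criteria are possible.

```python
import numpy as np


def prune_and_grow(weights, mask, fraction, rng):
    """One illustrative DST update on a single layer.

    Drops the `fraction` of active weights with the smallest magnitude and
    regrows the same number of connections at random positions that were
    inactive before the update, so the overall sparsity stays constant.
    `mask` is a boolean array with the same shape as `weights`.
    """
    flat_w = weights.ravel().copy()
    flat_m = mask.ravel().copy()
    active = np.flatnonzero(flat_m)
    inactive = np.flatnonzero(~flat_m)  # growth candidates, fixed before pruning
    k = min(int(fraction * active.size), inactive.size)
    if k == 0:
        return weights, mask

    # Prune: the k active weights closest to zero (assumed criterion).
    drop = active[np.argsort(np.abs(flat_w[active]))[:k]]
    flat_m[drop] = False
    flat_w[drop] = 0.0

    # Grow: k random inactive positions, initialized to zero (assumed criterion).
    grow = rng.choice(inactive, size=k, replace=False)
    flat_m[grow] = True
    flat_w[grow] = 0.0
    return flat_w.reshape(weights.shape), flat_m.reshape(mask.shape)


def n_m_mask(weights, n=2, m=4):
    """Boolean mask keeping the n largest-magnitude entries in every group
    of m consecutive weights along the last axis of a 2-D array."""
    rows, cols = weights.shape
    if cols % m != 0:
        raise ValueError("number of columns must be divisible by m")
    groups = np.abs(weights).reshape(rows, cols // m, m)
    keep = np.argsort(groups, axis=-1)[..., -n:]  # indices of the n largest
    mask = np.zeros(groups.shape, dtype=bool)
    np.put_along_axis(mask, keep, True, axis=-1)
    return mask.reshape(rows, cols)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mask = rng.random((8, 16)) < 0.25          # roughly 75% sparse layer
    weights = rng.normal(size=(8, 16)) * mask  # inactive weights are zero
    weights, mask = prune_and_grow(weights, mask, fraction=0.3, rng=rng)
    print("active connections:", int(mask.sum()))
    print("2:4 mask row:", n_m_mask(rng.normal(size=(1, 8)))[0].astype(int))
```

In a real training loop, an update like `prune_and_grow` would typically run every few hundred optimizer steps, with the mask re-applied to the weights (and their gradients) in between; that schedule is likewise an assumption here, not something the summary specifies.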
Papers
Automatic Noise Filtering with Dynamic Sparse Training in Deep Reinforcement Learning
Bram Grooten, Ghada Sokar, Shibhansh Dohare, Elena Mocanu, Matthew E. Taylor, Mykola Pechenizkiy, Decebal Constantin Mocanu
Bi-directional Masks for Efficient N:M Sparse Training
Yuxin Zhang, Yiting Luo, Mingbao Lin, Yunshan Zhong, Jingjing Xie, Fei Chao, Rongrong Ji