Network Compression
Network compression aims to reduce the size and computational cost of deep neural networks (DNNs) without significant loss of performance. Current research focuses on techniques such as pruning (removing less important connections), quantization (reducing the numerical precision of weights), and low-rank approximation, applied either during training or post-training and across a range of architectures, including CNNs, GANs, and transformers. These advances are crucial for deploying large-scale DNNs on resource-constrained devices and for improving the efficiency of training and inference, with impact on both the scientific understanding of DNNs and their practical applications across diverse fields.
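As a rough illustration of the three techniques named above, the sketch below applies magnitude pruning, uniform 8-bit quantization, and a low-rank (SVD) approximation to a single weight tensor. This is a minimal toy example, not taken from any of the listed papers; the sparsity level, bit-width, and rank are illustrative assumptions.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return weight.clone()
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight * (weight.abs() > threshold)

def quantize_uint8(weight: torch.Tensor):
    """Affine quantization of a float tensor to 8-bit integers (4x smaller than float32)."""
    w_min, w_max = weight.min(), weight.max()
    scale = (w_max - w_min) / 255.0
    q = torch.clamp(((weight - w_min) / scale).round(), 0, 255).to(torch.uint8)
    return q, scale, w_min

def dequantize(q: torch.Tensor, scale: torch.Tensor, zero: torch.Tensor) -> torch.Tensor:
    """Recover an approximate float tensor from the 8-bit representation."""
    return q.to(torch.float32) * scale + zero

def low_rank_approx(weight: torch.Tensor, rank: int) -> torch.Tensor:
    """Replace a weight matrix by its best rank-`rank` approximation via SVD."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    return U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]

if __name__ == "__main__":
    w = torch.randn(64, 64)
    w_sparse = magnitude_prune(w, sparsity=0.5)     # half of the weights set to zero
    q, scale, zero = quantize_uint8(w)
    w_dequant = dequantize(q, scale, zero)
    w_lowrank = low_rank_approx(w, rank=8)          # store two 64x8 factors instead of 64x64
    print("sparsity:", (w_sparse == 0).float().mean().item())
    print("max quantization error:", (w - w_dequant).abs().max().item())
    print("low-rank reconstruction error:", (w - w_lowrank).norm().item())
```

In practice these per-tensor operations are applied layer by layer across a full network, usually followed by fine-tuning to recover any accuracy lost to compression.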