Activation Compression
Activation compression aims to reduce the memory footprint of deep neural networks (DNNs) during training and inference, addressing the escalating resource demands of large models such as Transformers and Graph Neural Networks. Current research focuses on techniques such as quantization, pruning, and wavelet transforms, applied across a range of architectures, and explores both lossy and lossless compression to minimize memory usage while preserving model accuracy. These advances are crucial for training and deploying increasingly complex DNNs on resource-constrained devices and for accelerating training on large-scale datasets.
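To make the idea concrete, below is a minimal sketch of lossy activation compression during training, assuming PyTorch is available. The activation saved for the backward pass is quantized to 8-bit integers with a per-tensor scale and dequantized only when gradients are computed, trading a small amount of precision for reduced training memory. The class name and quantization scheme are illustrative, not taken from any specific paper listed here.

import torch


class QuantizedReLU(torch.autograd.Function):
    """ReLU that stores its saved activation as int8 instead of float32."""

    @staticmethod
    def forward(ctx, x):
        y = torch.relu(x)
        # Per-tensor quantization of the activation kept for backward.
        scale = y.abs().max().clamp(min=1e-8) / 127.0
        q = torch.clamp(torch.round(y / scale), -127, 127).to(torch.int8)
        ctx.save_for_backward(q)
        ctx.scale = scale
        return y

    @staticmethod
    def backward(ctx, grad_output):
        (q,) = ctx.saved_tensors
        # Dequantize the compressed activation to recover the ReLU mask.
        y = q.to(grad_output.dtype) * ctx.scale
        return grad_output * (y > 0)


if __name__ == "__main__":
    x = torch.randn(4, 16, requires_grad=True)
    out = QuantizedReLU.apply(x)
    out.sum().backward()
    print(x.grad.shape)  # gradients flow through the compressed activation

In practice, published methods apply such compression to many layer types, use finer-grained (per-group or per-channel) scales, and sometimes combine quantization with pruning or transform-domain coding; this sketch only shows the core memory-saving mechanism.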
Papers
Publication dates of the listed papers range from November 2021 to November 2024.