Lossless Performance
Lossless performance in data compression refers to reducing data size without any information loss, so the original data can be reconstructed exactly. Current research pursues this goal for large language models (LLMs) and images, using techniques such as weight-momentum joint shrinking for LLMs, mixed-precision quantization in expert-switching frameworks, and depth-wise compression of key-value caches. These advances are crucial for deploying large models efficiently: they cut storage requirements and improve the speed and scalability of applications ranging from natural language processing to image analysis and federated learning.
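The defining property, perfect reconstruction, can be illustrated with a generic lossless codec. The sketch below uses Python's standard `zlib` as a stand-in; it is not one of the specialized methods mentioned above (which target model weights and key-value caches), but it demonstrates the same round-trip guarantee:

```python
import zlib

def lossless_roundtrip(data: bytes) -> bytes:
    """Compress data and verify it decompresses byte-for-byte identically."""
    compressed = zlib.compress(data, level=9)
    restored = zlib.decompress(compressed)
    # Lossless means the reconstruction is exact, not approximate.
    assert restored == data
    return compressed

# Redundant payloads (like many model-weight layouts) compress well.
payload = b"weights " * 1000
packed = lossless_roundtrip(payload)
print(f"original: {len(payload)} bytes, compressed: {len(packed)} bytes")
```

The same check, exact equality after decompression, is what distinguishes lossless schemes from quantization or pruning methods that trade fidelity for size.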
Papers
(Paper titles not preserved; publication dates of the listed papers: October 28, 2024; October 14, 2024; July 31, 2024; June 17, 2024; June 13, 2024; May 23, 2024; April 16, 2024; April 5, 2024; March 1, 2024; January 24, 2024; October 16, 2023; August 16, 2023; August 8, 2023; July 25, 2023.)