Compressed Model
Compressed models are smaller, cheaper-to-run versions of large machine learning models, particularly deep neural networks and large language models (LLMs), built to preserve as much of the original model's performance as possible. Current research develops compression techniques such as pruning, quantization, and low-rank decomposition, along with transformer- and autoencoder-based approaches, often tailored to specific applications or model architectures. These advances matter for deploying sophisticated models on resource-constrained devices and for improving the efficiency and sustainability of AI systems, with applications ranging from image processing and natural language processing to medical imaging and scientific computing.
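As a rough illustration of two of the techniques named above, the following sketch applies magnitude pruning and symmetric 8-bit quantization to a small list of weights. It is a toy example under stated assumptions, not the method of any particular paper: the function names (magnitude_prune, quantize_int8, dequantize), the sample weights, and the 50% sparsity level are all hypothetical, and real systems operate on full tensors with library support. Low-rank decomposition is left out to keep the sketch short.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Indices ordered by absolute value; the first k are the ones to drop.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric uniform quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float weights."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for illustration.
weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.12, -0.4, 0.02]

pruned = magnitude_prune(weights, sparsity=0.5)  # half the weights become zero
q, scale = quantize_int8(pruned)                 # 8-bit integers plus one float scale
restored = dequantize(q, scale)                  # approximation of the pruned weights
```

Each restored weight differs from its pruned value by at most half the quantization step (scale / 2), which is the sense in which compression trades a small, bounded error for a smaller representation.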
Papers
March 13, 2023
December 21, 2022
November 2, 2022
April 22, 2022
March 18, 2022
January 21, 2022