Compressed Model

Model compression aims to reduce the size and computational cost of large machine learning models, particularly deep neural networks and large language models (LLMs), while preserving their performance. Current research focuses on novel compression techniques, including pruning, quantization, and low-rank decomposition, as well as transformer- and autoencoder-based approaches, often tailored to specific applications or model architectures. These advances are crucial for deploying sophisticated models on resource-constrained devices and for improving the efficiency and sustainability of AI systems, with impact across fields ranging from image processing and natural language processing to medical imaging and scientific computing.
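
To make three of the techniques named above concrete, here is a minimal sketch, assuming nothing beyond NumPy and a random weight matrix standing in for one dense layer: magnitude pruning, low-rank decomposition via truncated SVD, and symmetric 8-bit quantization. The sparsity level, rank, and scale scheme are illustrative choices, not taken from any particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256)).astype(np.float32)  # stand-in dense-layer weights

# Magnitude pruning: zero out the smallest 90% of weights by absolute value
# (90% sparsity is an arbitrary illustrative choice).
threshold = np.quantile(np.abs(W), 0.9)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)

# Low-rank decomposition: keep the top-k singular components, so W is
# approximated by two thin factors storing k*(m+n) values instead of m*n.
k = 32
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :k] * S[:k]   # shape (256, k)
B = Vt[:k, :]          # shape (k, 256)
W_lowrank = A @ B

# Symmetric 8-bit quantization: map float32 weights to int8 with one
# per-tensor scale, then dequantize to measure the approximation error.
scale = np.abs(W).max() / 127.0
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_dq = W_q.astype(np.float32) * scale

print("pruned nonzeros:", np.count_nonzero(W_pruned), "of", W.size)
print("low-rank rel. error:", np.linalg.norm(W - W_lowrank) / np.linalg.norm(W))
print("quantization rel. error:", np.linalg.norm(W - W_dq) / np.linalg.norm(W))
```

In practice these techniques are applied to trained weights (often with fine-tuning afterwards) rather than to random matrices, and many of the papers below combine them or adapt them to specific architectures.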

Papers