Based Pruning

Based pruning is a model compression technique that reduces the size and computational cost of machine learning models without significantly degrading their performance. Current research focuses on efficient pruning algorithms for a range of architectures, including boosted trees, convolutional neural networks (CNNs), vision transformers (ViTs), and mixture-of-experts (MoE) language models, often using uncertainty quantification or explainability methods to decide what to prune. This work matters because it allows large, powerful models to be deployed on resource-constrained devices and makes existing models more efficient and easier to interpret, with implications for both scientific understanding and practical applications.
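
As a rough illustration of the general idea, the sketch below scores each parameter and zeroes out the lowest-scoring fraction. It is not taken from any specific paper on this topic: the function name `prune_by_score`, the `sparsity` parameter, and the use of absolute magnitude as the default importance score are all assumptions made for the example. In practice the score could come from an uncertainty estimate or an explainability method instead.

```python
from typing import Callable, List, Sequence


def prune_by_score(
    weights: Sequence[float],
    sparsity: float,
    score_fn: Callable[[float], float] = abs,
) -> List[float]:
    """Zero out the lowest-scoring fraction of weights.

    score_fn is a hypothetical stand-in for an importance criterion;
    absolute magnitude is only an illustrative default.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    # Number of weights to remove (rounded down).
    n_prune = int(len(weights) * sparsity)
    # Rank indices from least to most important.
    order = sorted(range(len(weights)), key=lambda i: score_fn(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, -1.2, 0.01, 0.4]
sparse = prune_by_score(weights, sparsity=0.5)
print(sparse)  # [0.8, 0.0, 0.0, -1.2, 0.0, 0.4]
```

Passing a different `score_fn` changes the pruning criterion without touching the rest of the routine, which is why the score is kept as a separate argument in this sketch.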

Papers