One-Shot Pruning

One-shot pruning aims to reduce the size and computational cost of large neural networks, particularly large language models (LLMs) and diffusion models, without significant performance loss. Current research focuses on methods that identify and remove unimportant weights or layers in a single pass, without iterative retraining, employing pruning criteria based on weight magnitudes, gradients, or optimization-based approaches such as bi-level optimization. These techniques make it feasible to deploy large models on resource-constrained devices and to accelerate inference, improving both the efficiency of AI research and the accessibility of advanced AI applications.
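As a concrete illustration of the simplest criterion mentioned above, the sketch below prunes a weight tensor by global magnitude in a single pass. This is a minimal NumPy example, not the method of any particular paper; the function name and threshold handling are illustrative choices.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """One-shot magnitude pruning: zero out the `sparsity` fraction of
    entries with the smallest absolute value, in a single pass.

    Note: ties at the threshold may prune slightly more than requested.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of entries to remove
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune half the entries of a small weight matrix
w = np.array([[0.10, -2.0],
              [0.05,  3.0]])
pruned = magnitude_prune(w, sparsity=0.5)
# The two smallest-magnitude entries (0.05 and 0.10) are zeroed;
# the large entries are kept unchanged.
```

Real one-shot methods replace the magnitude criterion with more informative saliency scores (e.g. gradient- or Hessian-aware ones), but the overall structure of "score, threshold, mask in one pass" is the same.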

Papers