Pruning Mask

Pruning masks are binary tensors, matched in shape to a network's weight tensors, that selectively zero out less important weights or neurons, aiming to reduce model size and computational cost without significant performance loss. Current research focuses on efficient algorithms for generating these masks, particularly for large language models (LLMs) and convolutional neural networks (CNNs), often employing optimization techniques or leveraging the strong lottery ticket hypothesis. These advances matter because they enable the deployment of smaller, faster, and more energy-efficient deep learning models across a range of applications, including federated learning and intellectual property protection.
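
As a minimal illustration of the general idea (not tied to any particular paper above), the Python sketch below builds a magnitude-based pruning mask with NumPy and applies it elementwise to a weight matrix. The function name `magnitude_prune_mask` and the `sparsity` parameter are illustrative choices, not a standard API.

```python
import numpy as np

def magnitude_prune_mask(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a binary mask that zeroes the smallest-magnitude weights.

    `sparsity` is the fraction of weights to prune (0.0 keeps all, 0.9 removes 90%).
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return np.ones_like(weights)
    # Threshold at the k-th smallest magnitude; weights at or below it are pruned.
    threshold = np.partition(flat, k - 1)[k - 1]
    return (np.abs(weights) > threshold).astype(weights.dtype)

# Example: prune 90% of a random weight matrix and apply the mask elementwise.
rng = np.random.default_rng(0)
W = rng.normal(size=(512, 512)).astype(np.float32)
mask = magnitude_prune_mask(W, sparsity=0.9)
W_pruned = W * mask  # pruned entries are exactly zero
print(f"actual sparsity: {1.0 - mask.mean():.3f}")
```

In practice the mask is typically kept fixed and re-applied after each optimizer step during fine-tuning, so pruned weights stay zero; the methods surveyed here differ mainly in how the mask itself is chosen.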

Papers