Filter Pruning

Filter pruning is a model compression technique that reduces the computational cost and memory footprint of convolutional neural networks (CNNs) by removing entire filters, ideally without significant accuracy loss. Current research focuses on efficient algorithms for identifying and removing less important filters, often through iterative pruning combined with fine-tuning, and on comparing importance metrics (e.g., norm-based, similarity-based, attention-based). This work matters because it enables deployment of deep learning models on resource-constrained devices such as mobile phones and embedded systems, with applications ranging from object detection and face recognition to medical image analysis and real-time UAV tracking.
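
As a rough illustration of the norm-based family of importance metrics mentioned above, the sketch below ranks the filters of a single convolutional layer by L1 norm and drops the lowest-scoring fraction. It is a minimal example under stated assumptions, not the method of any particular paper: the function names (`l1_norm`, `select_filters_to_keep`, `prune_layer`), the `prune_ratio` parameter, and the toy weights are all hypothetical, and the layer is represented as plain nested lists shaped `[filters][channels][height][width]` rather than as a tensor from a deep learning framework.

```python
def l1_norm(weights):
    """Sum of absolute values over an arbitrarily nested list of numbers."""
    if isinstance(weights, (int, float)):
        return abs(weights)
    return sum(l1_norm(w) for w in weights)


def select_filters_to_keep(layer_weights, prune_ratio):
    """Return the sorted indices of filters that survive pruning.

    The filters with the smallest L1 norm are treated as least important
    (an assumption of norm-based criteria) and are removed first.
    """
    num_filters = len(layer_weights)
    num_prune = int(num_filters * prune_ratio)  # rounds down
    scores = [l1_norm(f) for f in layer_weights]
    ranked = sorted(range(num_filters), key=lambda i: scores[i])
    return sorted(ranked[num_prune:])


def prune_layer(layer_weights, prune_ratio):
    """Return (pruned_weights, kept_indices) for one layer."""
    keep = select_filters_to_keep(layer_weights, prune_ratio)
    return [layer_weights[i] for i in keep], keep


# Toy layer (hypothetical values): 4 filters, 1 input channel, 2x2 kernels.
filters = [
    [[[0.5, -0.5], [0.5, -0.5]]],   # large weights
    [[[0.01, 0.0], [0.0, -0.01]]],  # near-zero weights
    [[[1.0, 1.0], [-1.0, 1.0]]],    # largest weights
    [[[0.1, 0.1], [0.1, 0.1]]],     # small weights
]

pruned, kept = prune_layer(filters, prune_ratio=0.5)
print(kept)  # indices of the two filters with the largest L1 norm
```

In a real network, removing a filter also removes the corresponding input channel from the following layer, and the pruned model is typically fine-tuned afterwards; an iterative strategy would repeat this prune-then-fine-tune cycle several times. Neither step is shown here.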

Papers