DNN Compression

Deep neural network (DNN) compression aims to reduce the computational cost and memory footprint of DNNs without significantly sacrificing accuracy. Current research focuses on three main techniques: pruning (removing less important connections), quantization (reducing the numerical precision of weights and activations), and knowledge distillation (training a smaller model to reproduce the behavior of a larger one). These techniques are often combined with reinforcement learning or Bayesian optimization, which search for compression settings, such as per-layer pruning ratios or bit widths, that best balance model size against accuracy. Such methods are crucial for deploying DNNs on resource-constrained devices like embedded systems and mobile phones. Research is also exploring new evaluation metrics and addressing challenges such as backdoor attacks on compressed models.
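
To make two of these techniques concrete, the following is a minimal sketch of magnitude pruning followed by symmetric 8-bit quantization, applied to a toy list of weights. The helper names (magnitude_prune, quantize_int8, dequantize), the weight values, and the 50% sparsity level are assumptions chosen for illustration; they are not taken from any particular paper or library, and real systems typically operate on whole tensors and fine-tune the model after compression. Knowledge distillation is not shown, since it requires a training loop.

```python
# Illustrative sketch only: toy weights and hypothetical helper names.

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric uniform quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale maps the largest magnitude onto 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [v * scale for v in q]


# Assumed toy weights for a single layer.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.1, -0.9]
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
```

In this sketch, pruning keeps only the four largest-magnitude weights, and each surviving weight is then stored as a small integer plus one shared floating-point scale. The reconstruction error of each weight is at most half of the scale, which is the accuracy cost that compression methods try to keep small.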

Papers