BERT Pruning
BERT pruning aims to reduce the size and computational cost of BERT-based language models while preserving accuracy. Current research focuses on efficient pruning algorithms, such as gradual magnitude pruning, and on improving the pruning process with techniques like knowledge distillation and task-adaptive pre-training, often targeting specific model components such as embeddings. These efforts are driven by the need to deploy large language models on resource-constrained devices and to make training and inference more efficient, with implications for both edge AI applications and federated learning scenarios.
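To make the idea of gradual magnitude pruning concrete, below is a minimal sketch, not any particular paper's method: sparsity is ramped up over training with a cubic schedule (in the style of Zhu & Gupta's gradual pruning), and at each pruning step the lowest-magnitude weights in every linear layer are zeroed. It assumes PyTorch; the stand-in module, function names, and schedule parameters are illustrative, and in practice the same routine would be applied to a Hugging Face BERT model and the masks re-applied after each optimizer step.

```python
# Sketch of gradual magnitude pruning with a cubic sparsity schedule.
# Assumes PyTorch; `model` can be any nn.Module (e.g. a BERT encoder).
import torch
import torch.nn as nn


def target_sparsity(step, total_steps, final_sparsity=0.9, initial_sparsity=0.0):
    """Cubic schedule: sparsity ramps smoothly from initial to final."""
    progress = min(step / total_steps, 1.0)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1 - progress) ** 3


@torch.no_grad()
def apply_magnitude_pruning(model, sparsity):
    """Zero out the lowest-magnitude weights in every Linear layer."""
    for module in model.modules():
        if isinstance(module, nn.Linear):
            weight = module.weight
            k = int(sparsity * weight.numel())
            if k == 0:
                continue
            # Threshold = k-th smallest absolute value; weights at or below it are zeroed.
            threshold = weight.abs().flatten().kthvalue(k).values
            weight.mul_((weight.abs() > threshold).to(weight.dtype))


if __name__ == "__main__":
    # Stand-in for a transformer feed-forward block; swap in a real BERT model in practice.
    model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))
    total_steps = 1000
    for step in range(0, total_steps + 1, 100):  # prune every 100 steps
        s = target_sparsity(step, total_steps)
        apply_magnitude_pruning(model, s)
        zeros = sum((m.weight == 0).sum().item()
                    for m in model.modules() if isinstance(m, nn.Linear))
        total = sum(m.weight.numel()
                    for m in model.modules() if isinstance(m, nn.Linear))
        print(f"step {step:4d}  target sparsity {s:.2f}  actual {zeros / total:.2f}")
```

In a real training loop the pruning step would be interleaved with fine-tuning (and often paired with knowledge distillation from the dense model), so that the remaining weights can adapt as sparsity increases.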