Task-Specific Structured Pruning
Task-specific structured pruning reduces the size and computational cost of large language models (LLMs) and other deep learning models, such as those used in speech recognition and vision-language tasks, by removing components that contribute little to a given downstream task, without significant loss of performance on that task. Current research focuses on efficient algorithms that prune structured components such as attention heads, neurons, or entire layers, often combined with knowledge distillation or sparse regularization to mitigate the resulting performance degradation. These advances matter for deploying large models on resource-constrained devices and for reducing the computational and environmental cost of training and inference.
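To make the idea concrete, below is a minimal sketch of structured pruning applied to a feed-forward block using PyTorch's built-in pruning utilities. The layer sizes, the 30% pruning ratio, and the choice to prune output neurons of the first linear layer are illustrative assumptions, not taken from any particular paper listed here.

```python
# Structured pruning sketch: remove whole output neurons (rows of the weight
# matrix) from a feed-forward layer, a hardware-friendly sparsity pattern.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy task-specific feed-forward block (stand-in for one transformer FFN).
ffn = nn.Sequential(
    nn.Linear(768, 3072),
    nn.GELU(),
    nn.Linear(3072, 768),
)

# Prune 30% of the output neurons of the first linear layer, ranked by L2 norm
# along dim=0 (one group per output row).
prune.ln_structured(ffn[0], name="weight", amount=0.3, n=2, dim=0)

# Fold the pruning mask into the weights so the zeros become permanent.
prune.remove(ffn[0], "weight")

x = torch.randn(4, 768)
print(ffn(x).shape)  # torch.Size([4, 768])
```

Note that this zeroes out the selected rows rather than physically shrinking the layer; realizing the speedup requires slicing the weight matrices down to the surviving neurons. In task-specific pipelines, a sparse regularizer (for example, a group lasso over neuron groups) is often added to the fine-tuning loss to push whole groups toward zero before hard pruning, and a knowledge-distillation term against the unpruned teacher is a common way to recover lost accuracy.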