Efficient Expert
"Efficient Expert" research focuses on optimizing complex models by reducing computational costs without sacrificing performance. Current efforts concentrate on pruning large models like Mixture-of-Experts (MoE) and diffusion models into smaller, specialized "expert" sub-networks, often employing techniques like evolutionary algorithms or hierarchical feature selection to achieve this. These advancements are significant for deploying large language models and generative AI on resource-constrained devices, improving efficiency in various applications ranging from image generation to natural language processing. The ultimate goal is to enable the practical deployment of powerful models across a wider range of hardware and applications.