Dynamic Activation

Dynamic activation (DA) in large language models (LLMs) aims to improve inference efficiency by selectively activating only the neurons needed for a given input, thereby reducing computational cost without significant performance loss. Current research focuses on developing training-free DA methods, exploring the underlying causes of LLM sparsity (such as "massive over-activation"), and investigating optimal activation steering strategies for multi-property conditioning. These advances hold significant promise for accelerating LLM inference and broadening its applicability, particularly in resource-constrained environments, while also informing a deeper understanding of neural network expressivity and efficiency.

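As a rough illustration of the core idea only (not the method of any particular paper), the sketch below shows a feed-forward block that zeroes out weakly activated intermediate neurons using a magnitude threshold; the `threshold` parameter and the gating rule are assumptions made for demonstration.

```python
import torch
import torch.nn as nn

class ThresholdDynamicMLP(nn.Module):
    """Feed-forward block that suppresses weakly activated neurons.

    Illustrative sketch only: the threshold-based gating rule is an
    assumption, not a specific published DA method.
    """

    def __init__(self, d_model: int, d_ff: int, threshold: float = 0.1):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)
        self.down = nn.Linear(d_ff, d_model)
        self.threshold = threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.up(x))           # intermediate activations
        mask = h.abs() > self.threshold      # keep only "active" neurons
        h = h * mask                         # zero out the rest
        # A real DA kernel would skip the corresponding rows of the
        # down-projection entirely to save FLOPs; here we only emulate
        # the sparsity pattern for clarity.
        return self.down(h)

if __name__ == "__main__":
    layer = ThresholdDynamicMLP(d_model=64, d_ff=256, threshold=0.1)
    x = torch.randn(2, 10, 64)               # (batch, seq_len, d_model)
    out = layer(x)
    active = (torch.relu(layer.up(x)).abs() > layer.threshold).float().mean()
    print(f"output shape: {tuple(out.shape)}, fraction of neurons active: {active:.2f}")
```

In practice the speedup comes from skipping the masked neurons' weight rows at the kernel level rather than multiplying by zeros, which is why much of the current work focuses on identifying which neurons can be skipped cheaply and without retraining.
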
Papers