Activation Maximization

Activation maximization (AM) is a technique for interpreting the internal workings of neural networks by identifying input patterns that maximally activate specific neurons or groups of neurons. Current research applies AM to diverse model architectures, including large language models (LLMs) and convolutional neural networks (CNNs), typically using gradient-based optimization or training-free methods to improve efficiency and interpretability. The work matters for model transparency and for the reliability of model explanations, and it may help address challenges such as backdoor attacks and out-of-distribution generalization in applications ranging from image analysis to time series prediction.
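To make the gradient-based variant concrete, here is a minimal sketch in plain Python. It is not taken from any particular paper: the single tanh "neuron", its weights `w` and bias `b`, the step size, and the unit-norm constraint on the input are all illustrative assumptions. The idea is simply to start from some input, repeatedly step along the gradient of the neuron's activation with respect to that input, and project back onto a norm ball so the input cannot grow without bound.

```python
import math

# Hypothetical toy "neuron": a(x) = tanh(w . x + b).
# In practice this would be a unit inside a trained network.
def activation(w, b, x):
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)

def activation_grad(w, b, x):
    # d/dx tanh(w . x + b) = (1 - tanh^2(w . x + b)) * w
    a = activation(w, b, x)
    return [(1.0 - a * a) * wi for wi in w]

def project_to_ball(x, radius):
    # Keep the input inside an L2 ball; a simple stand-in for the
    # regularizers commonly used to keep optimized inputs bounded.
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm <= radius:
        return list(x)
    return [xi * radius / norm for xi in x]

def maximize_activation(w, b, x0, steps=200, lr=0.5, radius=1.0):
    # Projected gradient ascent on the input (not on the weights).
    x = project_to_ball(x0, radius)
    for _ in range(steps):
        g = activation_grad(w, b, x)
        x = [xi + lr * gi for xi, gi in zip(x, g)]
        x = project_to_ball(x, radius)
    return x

# Assumed example values, chosen only for illustration.
w = [0.6, -0.8, 0.0]
b = 0.0
x0 = [0.1, 0.1, 0.1]
x_star = maximize_activation(w, b, x0)
```

For this toy neuron the optimized input `x_star` ends up aligned with the weight vector `w`, which is the pattern the neuron responds to most strongly under the norm constraint. With a real network the same loop would normally rely on automatic differentiation rather than a hand-written gradient.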

Papers