Concept Neuron

Concept neurons are neurons that respond selectively to a specific concept regardless of the sensory modality in which it is presented. They are a focus of research exploring parallels between artificial neural network architectures and brain function. Current work identifies and manipulates such neurons in large language models and diffusion models, using techniques such as gradient analysis and network pruning to control concept activation and generation. The aims are to improve model interpretability, control, and robustness, and to offer insights into the biological mechanisms of memory and concept representation in the brain.
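
As a rough illustration of the gradient-analysis and pruning idea mentioned above, the following toy sketch scores hidden neurons by gradient times activation with respect to a "concept score" and then ablates the highest-scoring one. It is not taken from any specific paper: the two-layer network, the weights `W1` and `w2`, the input `x`, and all function names are hypothetical, and real work operates on far larger models using automatic differentiation.

```python
# Toy sketch: rank hidden neurons by gradient x activation, then ablate the top ones.
# All weights, inputs, and function names are hypothetical illustrations.

def relu(values):
    return [max(0.0, v) for v in values]

def hidden_activations(W1, x):
    # One hidden layer: h = relu(W1 @ x)
    return relu([sum(w * xi for w, xi in zip(row, x)) for row in W1])

def concept_score(w2, h, mask=None):
    # Linear readout standing in for "how strongly the concept is expressed".
    # mask[j] == 0.0 prunes (ablates) hidden neuron j.
    mask = mask or [1.0] * len(h)
    return sum(w * a * m for w, a, m in zip(w2, h, mask))

def attribution(w2, h):
    # For a linear readout, d(score)/d(h_j) = w2[j],
    # so gradient x activation for neuron j is w2[j] * h[j].
    return [w * a for w, a in zip(w2, h)]

def top_k_neurons(attr, k):
    # Indices of the k neurons with the largest absolute attribution.
    return sorted(range(len(attr)), key=lambda j: abs(attr[j]), reverse=True)[:k]

W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
w2 = [0.1, 0.2, 2.0, 0.3]
x = [1.0, 2.0]  # hypothetical input that expresses the concept

h = hidden_activations(W1, x)
attr = attribution(w2, h)
concept_neurons = top_k_neurons(attr, k=1)

# Prune the candidate concept neurons and compare the concept score.
mask = [0.0 if j in concept_neurons else 1.0 for j in range(len(h))]
baseline = concept_score(w2, h)
ablated = concept_score(w2, h, mask)
print("candidate concept neurons:", concept_neurons)
print("score before / after ablation:", baseline, ablated)
```

In this toy setting, a large drop in the score after ablation is the signal that the selected neurons carry most of the concept. In a real language or diffusion model the gradient would come from backpropagation rather than a closed-form expression, and the choice of concept score is itself a design decision.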

Papers