Neuron Attribution

Neuron attribution seeks to identify which neurons in a neural network, biological or artificial, are most responsible for a given output or behavior. Current research develops methods for finding these influential neurons, examines their roles across tasks and languages (especially in large language models), and designs new neuron architectures and training algorithms to improve model efficiency and interpretability. The work matters for making complex neural systems more explainable and for improving model design, and it may lead to advances in fields such as neuroscience, AI, and secure communication.
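
As a concrete illustration of what "identifying influential neurons" can mean, the sketch below scores the hidden neurons of a tiny ReLU network with gradient-times-activation, one simple attribution heuristic among many. The network, its hand-picked weights, and the function name `neuron_attributions` are assumptions made for this example only; they are not taken from any particular paper or library.

```python
import numpy as np

def neuron_attributions(x, W1, b1, w2):
    """Gradient-times-activation score for each hidden neuron.

    Assumed toy model: y = w2 . relu(W1 @ x + b1), with a scalar output.
    """
    pre = W1 @ x + b1          # hidden pre-activations
    h = np.maximum(pre, 0.0)   # ReLU activations
    grad_h = w2                # dy/dh_i = w2_i, since y is linear in h
    return h * grad_h          # per-neuron contribution to y

# Hypothetical weights chosen by hand so the result is easy to check.
W1 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [-1.0, -1.0]])
b1 = np.zeros(3)
w2 = np.array([2.0, -1.0, 5.0])
x = np.array([3.0, 1.0])

scores = neuron_attributions(x, W1, b1, w2)

# Rank neurons from most to least influential by absolute score.
ranking = np.argsort(-np.abs(scores))

print("scores:", scores)
print("ranking:", ranking)
```

In this toy setting the readout is linear with no bias, so the scores sum exactly to the network output. The third neuron has the largest outgoing weight but is inactive for this input, so it receives a score of zero. Deeper or nonlinear models generally lose the exact-sum property, and other attribution methods may be preferable there.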

Papers