Targeted Activation Penalty
Research on targeted activation penalties (TAP) aims to improve the robustness and interpretability of neural networks by manipulating neuron activations. Current work investigates how activation scaling, dropout, and related techniques can mitigate reliance on spurious signals, massive activations (excessively large activation values in specific dimensions), and task drift in large language models (LLMs) and other architectures, including convolutional neural networks (CNNs) and graph neural networks (GNNs). The goal is models that generalize better and are safer and easier to explain, achieved by better understanding and controlling the internal workings of neural networks.
Papers
Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations
Rima Hazra, Sayan Layek, Somnath Banerjee, Soujanya Poria
On GNN explainability with activation rules
Luca Veyrin-Forrer, Ataollah Kamal, Stefan Duffner, Marc Plantevit, Céline Robardet
Three Decades of Activations: A Comprehensive Survey of 400 Activation Functions for Neural Networks
Vladimír Kunc, Jiří Kléma
Enhancing Sequential Model Performance with Squared Sigmoid TanH (SST) Activation Under Data Constraints
Barathi Subramanian, Rathinaraja Jeyaraj, Rakhmonov Akhrorjon Akhmadjon Ugli, Jeonghong Kim
Exact capacity of the wide hidden layer treelike neural networks with generic activations
Mihailo Stojnic
Fixed width treelike neural networks capacity analysis -- generic activations
Mihailo Stojnic
A Sampling Theory Perspective on Activations for Implicit Neural Representations
Hemanth Saratchandran, Sameera Ramasinghe, Violetta Shevchenko, Alexander Long, Simon Lucey