Transformer Feed Forward Layer

Transformer feed-forward layers are a core component of modern deep learning models: applied position-wise to each token representation, they perform a nonlinear transformation that accounts for a large share of a transformer's parameters and compute. Current research focuses on improving their efficiency, interpretability, and generalization, exploring techniques such as adaptive gradient estimation, structured pruning, and novel activation functions to boost performance and reduce computational cost in large language models and other applications. These efforts aim to deepen understanding of the layers' internal workings and to yield more efficient and effective architectures across domains including natural language processing and image recognition. Further work investigates integrating feed-forward layers with recurrent networks and characterizes their mathematical properties, such as transitions to linearity under certain conditions.
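
For reference, a minimal sketch of the standard position-wise feed-forward block discussed above, written in PyTorch; the dimensions, dropout rate, and choice of GELU are illustrative defaults rather than details taken from any particular paper:

```python
import torch
import torch.nn as nn

class FeedForward(nn.Module):
    """Position-wise feed-forward block: expand, apply a nonlinearity, project back."""

    def __init__(self, d_model: int = 512, d_ff: int = 2048, dropout: float = 0.1):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)    # expansion (commonly ~4x d_model)
        self.down = nn.Linear(d_ff, d_model)  # projection back to model width
        self.act = nn.GELU()                  # GELU here; the original Transformer used ReLU
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); the same weights are applied independently
        # at every sequence position
        return self.down(self.dropout(self.act(self.up(x))))
```

Research directions mentioned above, such as structured pruning or alternative activation functions, typically modify exactly this block, for example by removing rows of the expansion matrix or swapping the nonlinearity.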

Papers