Fast Feedforward Network
Fast feedforward networks (FFFs, the abbreviation used in the original paper; "FFN" usually denotes an ordinary feedforward network) are a neural network architecture designed to cut the computational cost of inference by activating only a small, input-dependent subset of neurons on each forward pass. Current research focuses on improving their efficiency and accuracy through techniques such as load balancing and the addition of "master leaf" nodes, both inspired by Mixture of Experts models. Reported inference speedups over traditional feedforward networks reach up to several orders of magnitude. This makes larger models practical and speeds up processing on resource-constrained devices, in applications ranging from language modeling to image processing.
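To make the idea of input-dependent activation concrete, here is a minimal sketch. It assumes the neurons are arranged as a balanced binary tree: each routing neuron picks one child from the sign of its dot product with the input, and only the small leaf block at the end of the path is evaluated. The class name `TreeRoutedLayer`, its parameters, the random weights, and the hard sign-based routing are all illustrative assumptions, not the implementation from any of the listed papers. Training, load balancing, and master leaf nodes are left out.

```python
import random


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


class TreeRoutedLayer:
    """Toy layer (hypothetical): a balanced binary tree of routing neurons,
    with a small block of ReLU neurons at every leaf."""

    def __init__(self, dim_in, depth, leaf_width, seed=0):
        rng = random.Random(seed)
        self.depth = depth
        self.leaf_width = leaf_width
        n_nodes = 2 ** depth - 1   # internal routing neurons
        n_leaves = 2 ** depth      # leaf blocks
        # Random weights stand in for trained parameters.
        self.node_w = [[rng.gauss(0, 1) for _ in range(dim_in)]
                       for _ in range(n_nodes)]
        self.leaf_w = [[[rng.gauss(0, 1) for _ in range(dim_in)]
                        for _ in range(leaf_width)]
                       for _ in range(n_leaves)]

    def total_neurons(self):
        """Number of neurons a dense layer of the same size would evaluate."""
        return len(self.node_w) + len(self.leaf_w) * self.leaf_width

    def forward(self, x):
        """Return (leaf output, number of neurons actually evaluated)."""
        node = 0        # root, using heap indexing: children of i are 2i+1, 2i+2
        evaluated = 0
        for _ in range(self.depth):
            score = dot(self.node_w[node], x)
            evaluated += 1
            # Hard routing: positive score goes right, otherwise left.
            node = 2 * node + (2 if score > 0 else 1)
        leaf = node - len(self.node_w)  # convert heap index to leaf index
        out = [max(0.0, dot(w, x)) for w in self.leaf_w[leaf]]
        evaluated += self.leaf_width
        return out, evaluated


layer = TreeRoutedLayer(dim_in=8, depth=4, leaf_width=2)
out, evaluated = layer.forward([0.5] * 8)
print(f"evaluated {evaluated} of {layer.total_neurons()} neurons")
```

In this sketch the number of evaluated neurons grows with the tree depth plus the leaf width, while the total number of neurons grows exponentially with depth. That gap is where the inference savings come from.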
Papers
May 27, 2024
November 15, 2023
September 15, 2023
August 28, 2023