Field-Programmable Gate Array
Field-Programmable Gate Arrays (FPGAs) are reconfigurable hardware devices increasingly used to accelerate machine learning (ML) inference, particularly in resource-constrained settings such as edge computing. Current research focuses on optimizing ML model architectures, including transformers, convolutional neural networks (CNNs), and recurrent neural networks (RNNs), for efficient deployment on FPGAs, often using quantization and model compression to reduce resource usage and latency. This line of research matters because it allows powerful ML models to run on low-power embedded devices, with applications ranging from real-time image processing in robotics and scientific instrumentation to high-throughput data analysis in high-energy physics.
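As a rough illustration of the quantization mentioned above, the sketch below maps floating-point weights to signed 8-bit integers using a single symmetric scale factor. It is a minimal example under assumed conventions, not the method of any paper listed here: the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to signed 8-bit integers."""
    max_abs = max(abs(w) for w in weights)
    # The scale maps the largest magnitude to 127; use 1.0 if all weights are zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in q]


# Hypothetical weights chosen only for illustration.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Storing 8-bit integers instead of 32-bit floats cuts weight memory by a factor of four and lets the arithmetic run on narrow integer multipliers, which is one reason the technique suits FPGAs. The cost is a rounding error of at most half the scale per weight.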
Papers
FAMOUS: Flexible Accelerator for the Attention Mechanism of Transformer on UltraScale+ FPGAs
Ehsan Kabir, Md. Arafat Kabir, Austin R.J. Downey, Jason D. Bakos, David Andrews, Miaoqing Huang
ProTEA: Programmable Transformer Encoder Acceleration on FPGA
Ehsan Kabir, Jason D. Bakos, David Andrews, Miaoqing Huang