Layer Precision
Layer precision research focuses on optimizing the numerical precision of individual layers within neural networks, with the goal of improving energy efficiency and computational speed without significant accuracy loss. Current work explores techniques such as layer pruning, mixed-precision quantization (using algorithms such as EAGL and ALPS), and co-optimization of neural architecture and hardware parameters (e.g., in memristive crossbars), often applied to models such as ResNet, BERT, and VGG. These advances matter for deploying deep learning models on resource-constrained devices and for accelerating inference, improving both the efficiency of large-scale AI systems and the accessibility of AI in low-resource settings.
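As a minimal illustration of the per-layer mixed-precision idea (a toy sketch, not the EAGL or ALPS algorithms themselves), the snippet below simulates quantizing each layer's weights to a different bit-width in PyTorch; the layer names, bit-width assignments, and the `fake_quantize` helper are hypothetical choices for the example.

```python
import torch
import torch.nn as nn

def fake_quantize(weight: torch.Tensor, num_bits: int) -> torch.Tensor:
    """Symmetric, per-tensor uniform quantization of a weight tensor to num_bits."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = weight.abs().max() / qmax
    if scale == 0:
        return weight  # all-zero tensor: nothing to quantize
    return torch.round(weight / scale).clamp(-qmax - 1, qmax) * scale

def apply_mixed_precision(model: nn.Module, bit_widths: dict) -> None:
    """Quantize each named layer's weights in place to its assigned bit-width."""
    with torch.no_grad():
        for name, module in model.named_modules():
            if name in bit_widths and hasattr(module, "weight"):
                module.weight.copy_(fake_quantize(module.weight, bit_widths[name]))

# Example: keep the first and last layers at 8 bits, compress the middle layer to 4 bits.
model = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
bit_widths = {"0": 8, "2": 4, "4": 8}  # keys are nn.Sequential child names
apply_mixed_precision(model, bit_widths)
```

In practice, methods like the ones surveyed above search for or learn the per-layer bit-width assignment (here hard-coded in `bit_widths`) under accuracy, energy, or hardware constraints.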