Computational Overhead

Computational overhead, the extra compute, memory, and time that algorithms and models consume, is a major bottleneck in deploying advanced AI systems, particularly large language models and diffusion transformers. Current research mitigates it with techniques such as caching intermediate results (e.g., in diffusion models), parameter-efficient fine-tuning, and optimization algorithms that reduce the number of computations needed for training and inference. Lower overhead is crucial for running real-time applications on resource-constrained devices and for making large-scale AI more accessible and sustainable.
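As a rough illustration of the caching idea mentioned above, the following minimal sketch reuses the output of an expensive block for several consecutive iteration steps instead of recomputing it every step. It is not the method of any paper listed here. The names `IntervalCache`, `expensive_block`, and `run`, the fixed recomputation interval, and the toy update rule are all assumptions made for the example.

```python
from typing import Callable, Optional


class IntervalCache:
    """Reuse an expensive block's output for `interval` consecutive steps (hypothetical helper)."""

    def __init__(self, block: Callable[[float], float], interval: int) -> None:
        if interval < 1:
            raise ValueError("interval must be >= 1")
        self.block = block
        self.interval = interval
        self.cached: Optional[float] = None
        self.calls = 0  # how many times the expensive block actually ran

    def __call__(self, x: float, step: int) -> float:
        # Recompute on steps 0, interval, 2 * interval, ...; otherwise reuse the cached value.
        if self.cached is None or step % self.interval == 0:
            self.cached = self.block(x)
            self.calls += 1
        return self.cached


def expensive_block(x: float) -> float:
    # Stand-in for a costly layer; a real model would run attention or an MLP here.
    return x * x + 1.0


def run(steps: int, interval: int) -> IntervalCache:
    """Run a toy iterative loop and return the cache so its call count can be inspected."""
    cached_block = IntervalCache(expensive_block, interval)
    x = 1.0
    for step in range(steps):
        feature = cached_block(x, step)  # may be a stale value from an earlier step
        x = x - 0.01 * feature           # toy update, assumed for illustration only
    return cached_block


if __name__ == "__main__":
    print("calls with caching:", run(steps=12, interval=3).calls)
    print("calls without caching:", run(steps=12, interval=1).calls)
```

With 12 steps and an interval of 3, the block runs on steps 0, 3, 6, and 9 only. The trade-off is that the reused value is stale between recomputations, so the saving comes at the cost of an approximation whose acceptability depends on how slowly the cached features change.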

Papers

October 17, 2024

Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers
Shwai He, Tao Ge, Guoheng Sun, Bowei Tian, Xiaoyang Wang, Ang Li, Dong Yu
Transformer Megatron Decepticons Model Training Large Depth Computation Method Effective Approach Adaptive Depth Computational Overhead

October 15, 2024

Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
Yiding Jiang, Allan Zhou, Zhili Feng, Sadhika Malladi, J. Zico Kolter
Scaling Law Computational Efficiency Adaptive Optimization Data Mixture Proxy Model Computational Overhead

July 1, 2024

FORA: Fast-Forward Caching in Diffusion Transformer Acceleration
Pratheba Selvaraju, Tianyu Ding, Tianyi Chen, Ilya Zharkov, Luming Liang
Diffusion Transformer Transformer Based Diffusion Model Forward Caching Computational Overhead

May 7, 2024

Tiny Deep Ensemble: Uncertainty Estimation in Edge AI Accelerators via Ensembling Normalization Layers with Shared Weights
Soyed Tuhin Ahmed, Michael Hefenbrock, Mehdi B. Tahoori
Neural Network Uncertainty Estimation Deep Ensemble Model Ensembling Ensemble Approach Uncertainty Estimation Method AI Accelerator Computational Overhead

March 21, 2024

Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han, Chao Gao, Jinyang Liu, Jeff Zhang, Sai Qian Zhang
Comprehensive Survey Parameter Efficient Fine Tuning Large Model Large Pre Trained Model Computational Cost Computational Overhead

January 22, 2024

Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
Marlon Becker, Frederick Altrock, Benjamin Risse
Optimization Purpose Sharpness Aware Minimization Nesterov Momentum Gradient Ascent Computational Overhead

December 10, 2023

Diffusion for Natural Image Matting
Yihan Hu, Yiheng Lin, Wei Wang, Yao Zhao, Yunchao Wei, Humphrey Shi
Diffusion Explainer Encoder Decoder Image Matting Computational Overhead

October 17, 2023

Heterogenous Memory Augmented Neural Networks
Zihan Qiu, Zhen Liu, Shuicheng Yan, Shanghang Zhang, Jie Fu
Semi Parametric Standard Deep Computational Overhead Heterogeneous Memory

June 28, 2022

How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection
Mantas Mazeika, Bo Li, David Forsyth
Adversary Agent Model Stealing Attack Surrogate Neural Network Computational Overhead Prior Defense