AI Accelerator
AI accelerators are specialized hardware designed to speed up artificial intelligence computations and reduce their energy consumption. Current research focuses on optimizing performance for large language models and other deep neural networks, including exploring novel architectures such as compute-in-memory and employing techniques such as weight sparsity and model compression to improve efficiency. This field is crucial for deploying AI in resource-constrained environments like mobile devices and edge computing, and for enabling the development and application of increasingly complex AI models. Improved efficiency and reduced latency are key objectives, driving innovation in both hardware design and software optimization strategies.
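As an illustration of one of the techniques mentioned above, weight block sparsity zeroes out entire low-magnitude tiles of a weight matrix so that hardware can skip whole blocks of computation. The sketch below uses NumPy; the block size, keep ratio, and block-norm criterion are illustrative assumptions, not a specific accelerator's scheme:

```python
import numpy as np

def block_sparsify(weights, block_size=4, keep_ratio=0.5):
    """Zero out low-magnitude blocks of a 2-D weight matrix.

    The matrix is tiled into (block_size x block_size) blocks,
    and only the keep_ratio fraction with the largest L2 norm
    is retained. Kept blocks stay dense, which is what lets
    block-sparse hardware skip the pruned tiles entirely.
    """
    rows, cols = weights.shape
    assert rows % block_size == 0 and cols % block_size == 0
    # View the matrix as a grid of blocks and score each block.
    grid = weights.reshape(rows // block_size, block_size,
                           cols // block_size, block_size)
    norms = np.sqrt((grid ** 2).sum(axis=(1, 3)))
    # Threshold = norm of the k-th strongest block.
    k = int(np.ceil(keep_ratio * norms.size))
    threshold = np.partition(norms.ravel(), -k)[-k]
    # Broadcast the per-block mask back over block entries.
    mask = (norms >= threshold)[:, None, :, None]
    return (grid * mask).reshape(rows, cols)

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 8))
sparse_w = block_sparsify(w, block_size=4, keep_ratio=0.5)
```

Because pruning decisions are made per block rather than per weight, the surviving nonzeros form a regular pattern that maps cleanly onto tiled matrix-multiply units.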
Papers
Weight Block Sparsity: Training, Compilation, and AI Engine Accelerators
Paolo D'Alberto, Taehee Jeong, Akshai Jain, Shreyas Manjunath, Mrinal Sarmah, Samuel Hsu, Yaswanth Raparti, Nitesh Pipralia
Inference Optimization of Foundation Models on AI Accelerators
Youngsuk Park, Kailash Budhathoki, Liangfu Chen, Jonas Kübler, Jiaji Huang, Matthäus Kleindessner, Jun Huan, Volkan Cevher, Yida Wang, George Karypis