Low Precision
Low-precision computing aims to make machine learning and other computationally intensive tasks more efficient by representing numbers with fewer bits, which reduces memory usage, energy consumption, and computation time. Current research focuses on algorithms and hardware architectures that mitigate the accuracy loss inherent in low-precision computation, including mixed-precision strategies, optimized matrix multiplication methods, and alternative number systems (e.g., posits). These advances are crucial for deploying large language models and other complex AI applications on resource-constrained devices, such as mobile phones and embedded systems, and for accelerating high-performance computing in general.
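The accuracy loss and the mixed-precision remedy mentioned above can be illustrated with a minimal sketch. It is not taken from any of the papers listed here. It uses only the Python standard library, where `struct`'s `"e"` format rounds a value to IEEE 754 half precision (fp16). The helper names `to_fp16`, `sum_fp16_accumulator`, and `sum_wide_accumulator` are hypothetical, and a 64-bit Python float stands in for the fp32 accumulator that real hardware would typically use.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def sum_fp16_accumulator(values):
    """Pure low precision: every partial sum is rounded back to fp16."""
    total = 0.0
    for v in values:
        total = to_fp16(total + to_fp16(v))
    return total


def sum_wide_accumulator(values):
    """Mixed precision: inputs are stored as fp16, but the running sum
    is kept in a wider float (assumption: float64 stands in for fp32)."""
    total = 0.0
    for v in values:
        total += to_fp16(v)
    return total


values = [0.001] * 10_000  # the exact sum is 10.0

low = sum_fp16_accumulator(values)
mixed = sum_wide_accumulator(values)

print("bytes per value, fp16 vs fp32:",
      struct.calcsize("e"), "vs", struct.calcsize("f"))
print("fp16 accumulator :", low)
print("wide accumulator :", mixed)
```

In this sketch the fp16 accumulator stalls well short of 10: once the running total is large enough, each increment is smaller than half the spacing between adjacent fp16 values and is rounded away. Keeping only the accumulator in a wider format recovers most of the accuracy while the stored values still take half the memory of fp32, which is the basic trade-off that mixed-precision strategies exploit.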
Papers
Collage: Light-Weight Low-Precision Strategy for LLM Training
Tao Yu, Gaurav Gupta, Karthick Gopalswamy, Amith Mamidala, Hao Zhou, Jeffrey Huynh, Youngsuk Park, Ron Diamant, Anoop Deoras, Luke Huan
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel, Yuzong Chen, Bahaa Kotb, Sushma Prasad, Gang Wu, Sheng Li, Mohamed S. Abdelfattah, Zhiru Zhang