Floating Point Format
Floating-point formats are numerical representations central to efficient deep learning, particularly for large language models (LLMs) and convolutional neural networks (CNNs). Current research focuses on novel low-bit (e.g., 4-bit and 8-bit) floating-point formats, including block floating point (BFP) and tapered-precision designs, that reduce memory usage, increase computational speed, and improve energy efficiency with little or no loss of accuracy. This work is driven by the need to deploy ever-larger models on resource-constrained hardware, and results show that carefully designed low-precision floating-point formats often outperform integer formats of the same bit width, in part because their non-uniform value spacing provides the dynamic range needed to represent the outlier values common in LLM weights and activations.
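To make the trade-offs concrete, below is a minimal NumPy sketch of two of the ideas mentioned above: "fake" quantization to the 4-bit E2M1 floating-point grid used by formats such as MXFP4, and a simple block floating point scheme in which each block of values shares one exponent. The function names, block size, and mantissa width are illustrative assumptions, not taken from any particular paper.

```python
import numpy as np

# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent,
# 1 mantissa bit) -- the value grid used by FP4 formats such as MXFP4.
FP4_E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def fake_quantize_fp4(x):
    """Round x to the nearest FP4 (E2M1) value after per-tensor scaling."""
    x = np.asarray(x, dtype=np.float64)
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 6.0 if max_abs > 0 else 1.0  # map max |x| onto 6.0
    grid = np.concatenate([-FP4_E2M1_GRID[:0:-1], FP4_E2M1_GRID])
    idx = np.abs(x[..., None] / scale - grid).argmin(axis=-1)
    return grid[idx] * scale


def fake_quantize_bfp(x, block_size=16, mantissa_bits=4):
    """Block floating point: each block of `block_size` values shares one
    exponent; elements keep only a small signed integer mantissa."""
    flat = np.asarray(x, dtype=np.float64).ravel()
    pad = (-len(flat)) % block_size
    blocks = np.pad(flat, (0, pad)).reshape(-1, block_size)
    max_abs = np.max(np.abs(blocks), axis=1, keepdims=True)
    # Shared exponent chosen so the block's largest magnitude still fits.
    exp = np.where(max_abs > 0, np.ceil(np.log2(np.maximum(max_abs, 1e-300))), 0.0)
    qmax = 2 ** (mantissa_bits - 1) - 1              # e.g. 7 for 4-bit mantissas
    scale = 2.0 ** exp / qmax
    q = np.clip(np.round(blocks / scale), -qmax, qmax)
    return (q * scale).ravel()[: len(flat)].reshape(np.shape(x))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(64)
    for name, fn in [("FP4 E2M1", fake_quantize_fp4), ("BFP", fake_quantize_bfp)]:
        err = np.mean((w - fn(w)) ** 2)
        print(f"{name:9s} mean squared reconstruction error: {err:.5f}")
```

Running the demo compares the mean squared reconstruction error of the two schemes on random Gaussian weights; real systems refine this with per-channel or per-block scaling and hardware-friendly packing.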