Stochastic Rounding

Stochastic rounding is a rounding scheme that, instead of always rounding to the nearest representable value, rounds up or down at random, with the probability of each outcome set by the input's relative distance to the two nearest representable values, so the result is correct in expectation. It is being actively researched for its ability to improve the efficiency and accuracy of low-precision computation, particularly in machine learning. Current research focuses on understanding its impact on the convergence of optimization algorithms, especially gradient descent, and on its application to post-training quantization of deep neural networks, including large language models. This line of work demonstrates that stochastic rounding can implicitly regularize matrices, mitigate the vanishing gradient problem in low-precision training, and enhance the performance of quantized models, leading to more efficient and robust machine learning systems. The findings are relevant to both theoretical computer science and practical applications involving resource-constrained devices and large-scale model deployment.
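For concreteness, below is a minimal NumPy sketch of the standard unbiased variant: round up with probability equal to the fractional distance to the lower grid point. The function name `stochastic_round` and the uniform fixed-point grid `step` are illustrative assumptions, not drawn from any particular paper below.

```python
import numpy as np

def stochastic_round(x, step=1.0, rng=None):
    """Round each element of x to a multiple of `step` (an assumed
    uniform grid), rounding up with probability equal to the fractional
    distance to the lower multiple, so the result is unbiased:
    E[stochastic_round(x)] == x (up to floating-point error)."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = np.asarray(x, dtype=np.float64) / step
    lower = np.floor(scaled)
    frac = scaled - lower  # distance to the lower grid point, in [0, 1)
    round_up = rng.random(size=lower.shape) < frac
    return (lower + round_up) * step

# Averaged over many draws, stochastic rounding preserves the mean,
# whereas round-to-nearest introduces a deterministic bias.
x = np.full(100_000, 0.3)
print(stochastic_round(x, step=1.0).mean())  # ~0.3
print(np.round(x).mean())                    # 0.0
```

This mean-preserving property is what the convergence and quantization results above exploit: small gradient updates or weight residuals that round-to-nearest would always discard survive in expectation under stochastic rounding.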

Papers