Binarization Method

Binarization methods aim to drastically reduce the computational cost and memory footprint of neural networks by representing weights and/or activations with a single bit each. Current research focuses on improving the accuracy of binarized models, particularly for large language models (LLMs) and vision transformers (ViTs), through techniques such as alternating refined binarization, learnable binarization during training, and the incorporation of structural sparsity. These advances matter because they allow powerful deep learning models to run on resource-constrained hardware, with applications spanning natural language processing, computer vision, and mobile and embedded systems.
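
To make the core idea concrete, the sketch below shows a minimal, illustrative example of 1-bit weight binarization in the style of XNOR-Net: real-valued weights are kept for the optimizer, the forward pass uses their sign scaled by a per-row mean absolute value, and a straight-through estimator passes gradients through the non-differentiable sign function. This is a generic illustration, not the method of any particular paper listed here; names such as `BinaryLinear` and `BinarizeSTE` are hypothetical.

```python
import torch
import torch.nn as nn


class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator (STE)."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # STE: pass gradients through where |w| <= 1, zero them elsewhere.
        return grad_output * (w.abs() <= 1).to(grad_output.dtype)


class BinaryLinear(nn.Linear):
    """Linear layer with 1-bit weights (XNOR-Net-style sketch).

    At inference time each weight costs one bit plus one floating-point
    scale per output row, instead of a full-precision value.
    """

    def forward(self, x):
        # Per-output-row scaling factor alpha = mean(|w|) preserves magnitude.
        alpha = self.weight.abs().mean(dim=1, keepdim=True)
        w_bin = BinarizeSTE.apply(self.weight) * alpha
        return nn.functional.linear(x, w_bin, self.bias)


if __name__ == "__main__":
    layer = BinaryLinear(16, 4)
    out = layer(torch.randn(2, 16))
    print(out.shape)  # torch.Size([2, 4])
```

The techniques surveyed below refine this basic recipe, for example by alternating refinement of the binary codes and scales, learning the binarization thresholds during training, or combining binarization with structural sparsity.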

Papers