Binarization Method
Binarization methods aim to drastically reduce the computational cost and memory footprint of neural networks by representing weights and/or activations with a single bit each. Current research focuses on improving the accuracy of binarized models, particularly large language models (LLMs) and vision transformers (ViTs), through techniques such as alternating refined binarization, learnable binarization during training, and structural sparsity. These advances matter because they make it possible to deploy powerful deep learning models on resource-constrained devices, with applications ranging from natural language processing and computer vision to mobile and embedded systems.
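To make the one-bit idea concrete, here is a minimal NumPy sketch of a common baseline: each weight is replaced by its sign, and one floating-point scale is kept per output row. This is an illustrative assumption, not the method of any paper listed here; the function names binarize_weights and pack_bits, the per-row scaling, and the 4 x 16 random matrix are all hypothetical choices made for the example.

```python
import numpy as np

def binarize_weights(w):
    """Approximate w by alpha * b, with b in {-1, +1} and one scale per row."""
    # Map zeros to +1 so that every entry can be stored in a single bit.
    b = np.where(w >= 0, 1.0, -1.0)
    # For b = sign(w), the mean absolute value of a row is the scale that
    # minimizes the squared error ||w - alpha * b||^2 for that row.
    alpha = np.mean(np.abs(w), axis=1, keepdims=True)
    return alpha, b

def pack_bits(b):
    """Pack a {-1, +1} matrix into uint8 words, 8 weights per byte."""
    return np.packbits(b > 0, axis=1)

# Hypothetical example: a small random float32 weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 16)).astype(np.float32)

alpha, b = binarize_weights(w)
w_hat = alpha * b          # dequantized approximation of w
packed = pack_bits(b)      # compact 1-bit storage of the signs

# float32 uses 32 bits per weight, the packed signs use 1 bit per weight.
compression = w.nbytes // packed.nbytes
```

In this sketch the sign bits alone take 32 times less memory than the float32 weights; the per-row scales add a small overhead that shrinks as rows get longer. The accuracy-oriented techniques mentioned above can be read as ways of reducing the gap between w and an approximation such as w_hat.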
Papers
Partial Binarization of Neural Networks for Budget-Aware Efficient Learning
Udbhav Bamba, Neeraj Anand, Saksham Aggarwal, Dilip K. Prasad, Deepak K. Gupta
PatchRefineNet: Improving Binary Segmentation by Incorporating Signals from Optimal Patch-wise Binarization
Savinay Nagendra, Chaopeng Shen, Daniel Kifer