Butterfly Matrix

Butterfly matrices are structured matrices that factor into a product of sparse matrices, which makes them a building block for neural network architectures with fewer parameters and cheaper matrix multiplies. Current research applies them in several model types, including transformers and normalizing flows, to speed up training and reduce memory requirements for large language models and other deep learning applications. The aim is to address the high computational cost of training large models: by replacing dense weight matrices with butterfly-structured ones, these methods seek comparable performance with fewer parameters and shorter training times. Such efficiency gains matter both for the scalability of deep learning research and for deploying large models in resource-constrained environments.
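To make the parameter savings concrete, here is a minimal NumPy sketch of one common butterfly formulation: an n x n matrix (n a power of two) written as a product of log2(n) sparse factors, where each factor mixes pairs of coordinates with a learnable 2x2 block. The function names random_butterfly_factors and butterfly_multiply are illustrative and do not come from any library, and the papers listed below each use their own variants of this structure.

```python
import numpy as np

def random_butterfly_factors(n, rng):
    """Return log2(n) factors, each stored as n/2 small 2x2 blocks.

    Hypothetical helper for illustration: one 2x2 block per coordinate pair.
    """
    assert n > 1 and n & (n - 1) == 0, "n must be a power of two"
    levels = n.bit_length() - 1
    return [rng.standard_normal((n // 2, 2, 2)) for _ in range(levels)]

def butterfly_multiply(factors, x):
    """Apply the factors to x in turn without forming a dense matrix."""
    n = x.shape[0]
    y = x.astype(float)
    for level, blocks in enumerate(factors):
        stride = 1 << level
        groups = n // (2 * stride)
        # Axis 1 holds the two members of each pair (indices i and i + stride).
        y = y.reshape(groups, 2, stride)
        b = blocks.reshape(groups, stride, 2, 2)
        # Mix every pair with its own 2x2 block.
        y = np.einsum("gsij,gjs->gis", b, y).reshape(n)
    return y

n = 8
rng = np.random.default_rng(0)
factors = random_butterfly_factors(n, rng)
x = rng.standard_normal(n)

# Build the equivalent dense matrix column by column, only to check the result.
dense = np.stack([butterfly_multiply(factors, e) for e in np.eye(n)], axis=1)
assert np.allclose(dense @ x, butterfly_multiply(factors, x))

# 2 * n * log2(n) = 48 stored parameters for n = 8, versus n * n = 64 dense.
n_params = sum(f.size for f in factors)
```

The gap widens quickly with size: under this parameterization, n = 1024 needs 20,480 parameters, while a dense matrix of the same shape needs 1,048,576.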

Papers

November 10, 2023

Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization
Weiyang Liu, Zeju Qiu, Yao Feng, Yuliang Xiu, Yuxuan Xue, Longhui Yu, Haiwen Feng, Zhen Liu, Juyeon Heo, Songyou Peng, Yandong Wen, Michael J. Black, Adrian Weller, Bernhard Schölkopf
Fine Tuning, Orthogonal Fine Tuning, Butterfly Matrix, Orthogonal Parameterization

October 18, 2023

Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture
Daniel Y. Fu, Simran Arora, Jessica Grogan, Isys Johnson, Sabri Eyuboglu, Armin W. Thomas, Benjamin Spector, Michael Poli, Atri Rudra, Christopher Ré
Architecture Design, Major Challenge, Bottleneck, GPT Model, Simple Alternating Mixer, Structured Matrix, Butterfly Matrix

September 16, 2023

Reducing Memory Requirements for the IPU using Butterfly Factorizations
S.-Kazem Shekofteh, Christian Alles, Holger Fröning
High Performance Computing, Processing Unit, Butterfly Matrix

September 28, 2022

ButterflyFlow: Building Invertible Layers with Butterfly Matrices
Chenlin Meng, Linqi Zhou, Kristy Choi, Tri Dao, Stefano Ermon
Normalizing Flow, Reversible Architecture, Butterfly Matrix

April 1, 2022

Monarch: Expressive Structured Matrices for Efficient and Accurate Training
Tri Dao, Beidi Chen, Nimit Sohoni, Arjun Desai, Michael Poli, Jessica Grogan, Alexander Liu, Aniruddh Rao, Atri Rudra, Christopher Ré
Fine Tuning, High Efficiency, Weight Matrix, Structured Matrix, Accurate Training, Sparse to Sparse Training, Butterfly Matrix

March 25, 2022

Deformable Butterfly: A Highly Structured and Sparse Linear Transform
Rui Lin, Jie Ran, King Hung Chiu, Graziano Chesi, Ngai Wong
Convolutional Layer, Early Layer, Butterfly Matrix

November 30, 2021

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré
Neural Network Model, Sparse Model, Sparse Training, Sparse Matrix, Sparse Mask, Monarch Butterfly, Butterfly Matrix