Low Rank
Low-rank techniques reduce the computational cost and memory requirements of large-scale machine learning models by representing high-dimensional data or model parameters with lower-dimensional structures. Current research focuses on applying low-rank methods to improve the efficiency of large language models (LLMs) and other deep learning architectures, often through low-rank adaptation (LoRA) and its variants as well as matrix and tensor factorization. These advances are significant because they allow larger and more capable models to be trained and deployed on resource-constrained hardware, benefiting applications such as natural language processing, computer vision, and recommendation systems. Theoretical work is also exploring the inherent low-rank structure of trained models to better understand and optimize these methods.
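As a concrete sketch of the core idea, the snippet below adds a trainable low-rank update BA to a frozen linear layer, in the style of LoRA. This is a minimal illustration, not the method of any paper listed here; the layer sizes, rank, and scaling factor are arbitrary example values, and PyTorch is assumed.

```python
# Minimal LoRA-style low-rank adapter (illustrative sketch; rank and alpha are example values).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen dense layer W plus a trainable low-rank update B @ A."""

    def __init__(self, in_features: int, out_features: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pretrained weights stay frozen
        self.base.bias.requires_grad_(False)
        # Full-rank update would need out_features * in_features parameters;
        # the low-rank factors need only rank * (in_features + out_features).
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)  # rank x d_in
        self.B = nn.Parameter(torch.zeros(out_features, rank))        # d_out x rank, zero init
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base output plus scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(768, 768, rank=8)
y = layer(torch.randn(4, 768))  # only A and B receive gradients during fine-tuning
```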
Papers
Low-Rank Adaption on Transformer-based Oriented Object Detector for Satellite Onboard Processing of Remote Sensing Images
Xinyang Pu, Feng Xu
SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining
Andi Han, Jiaxiang Li, Wei Huang, Mingyi Hong, Akiko Takeda, Pratik Jawanpuria, Bamdev Mishra
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
Mohammad Mozaffari, Amir Yazdanbakhsh, Zhao Zhang, Maryam Mehri Dehnavi
LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters
Xinyu Zhou, Boris Knyazev, Alexia Jolicoeur-Martineau, Jie Fu