Low Rank
Low-rank techniques reduce the computational cost and memory requirements of large-scale machine learning models by representing high-dimensional data or model parameters with lower-dimensional structures, for example by approximating a large weight matrix as the product of two much smaller matrices. Current research focuses on applying low-rank methods to improve the efficiency of large language models (LLMs) and other deep learning architectures, often through low-rank adaptation (LoRA) and its variants, as well as matrix and tensor factorization. These advances matter because they make it feasible to train and deploy larger, more capable models on resource-constrained devices, with applications in natural language processing, computer vision, and recommendation systems. In parallel, theoretical work explores the inherent low-rank properties of trained models to better understand and optimize these methods.
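To make the idea concrete, here is a minimal NumPy sketch of the two ingredients mentioned above: a rank-r factorization of a weight matrix via truncated SVD, and a LoRA-style forward pass in which a frozen weight is combined with a low-rank update. The function names `low_rank_factors` and `lora_forward`, the matrix sizes, and the `scale` parameter are illustrative assumptions, not taken from any of the papers listed below.

```python
import numpy as np

def low_rank_factors(W, r):
    """Return (B, A) such that B @ A is the best rank-r approximation of W
    in the Frobenius norm, computed with a truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    B = U[:, :r] * s[:r]   # shape (d_out, r): left vectors scaled by singular values
    A = Vt[:r, :]          # shape (r, d_in)
    return B, A

def lora_forward(x, W0, B, A, scale=1.0):
    """LoRA-style forward pass (hypothetical helper): the frozen weight W0
    plus a low-rank update B @ A, applied without forming B @ A explicitly."""
    return x @ W0.T + scale * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 48, 4

# Synthetic weight matrix constructed to have rank exactly r (an assumption
# made for illustration; real weights are at best approximately low rank).
W = rng.standard_normal((d_out, r)) @ rng.standard_normal((r, d_in))

B, A = low_rank_factors(W, r)
full_params = W.size               # 64 * 48 = 3072
low_rank_params = B.size + A.size  # 64 * 4 + 4 * 48 = 448
rel_error = np.linalg.norm(W - B @ A) / np.linalg.norm(W)

# In LoRA, B is commonly initialised to zero so the adapted model starts
# out identical to the frozen base model.
x = rng.standard_normal((8, d_in))
B0 = np.zeros((d_out, r))
A0 = rng.standard_normal((r, d_in))
y_base = x @ W.T
y_lora = lora_forward(x, W, B0, A0)
```

In this toy setting the two factors store 448 numbers instead of 3072, and the reconstruction error is at the level of floating-point round-off because the matrix was built to be exactly rank 4. For real model weights the error depends on how quickly the singular values decay.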
Papers
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
Mohammad Mozaffari, Amir Yazdanbakhsh, Zhao Zhang, Maryam Mehri Dehnavi
LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters
Xinyu Zhou, Boris Knyazev, Alexia Jolicoeur-Martineau, Jie Fu