Weight Matrix
Weight matrices, the core components of neural networks, are the subject of active research aimed at improving efficiency, generalization, and interpretability. Current efforts explore low-rank approximations, structured matrices (e.g., Monarch, Block Tensor-Train), regularization such as weight decay, and parameter-efficient fine-tuning (PEFT) techniques such as LoRA, all of which aim to optimize weight structure and reduce computational cost. These advances are central to scaling deep learning models, enabling their application to larger datasets and more complex tasks, and improving our understanding of how such models learn and generalize.
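As a concrete illustration of the low-rank ideas above, the following is a minimal sketch of a LoRA-style layer, assuming PyTorch; the class name `LoRALinear`, the rank and scaling defaults, and the initialization scheme are illustrative assumptions, not a reference implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight W plus a trainable low-rank update (alpha/rank) * B @ A.

    Illustrative sketch only: hyperparameter defaults and init are assumptions.
    """

    def __init__(self, in_features: int, out_features: int,
                 rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # freeze the pretrained weight
        # Low-rank factors: A maps input -> rank, B maps rank -> output.
        # B starts at zero so training begins from the unmodified base layer.
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Effective weight is W + scaling * B @ A, applied without materializing it.
        return self.base(x) + self.scaling * (x @ self.A.t() @ self.B.t())

# Usage: only A and B receive gradients, so a rank-8 update to a 768x768 layer
# trains 2 * 8 * 768 parameters instead of 768 * 768.
layer = LoRALinear(768, 768, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 12288
```

The design point is that the dense weight matrix is never modified during fine-tuning; the low-rank factors can later be merged back into it, so inference cost is unchanged.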
Papers
On Generalization Bounds for Neural Networks with Low Rank Layers
Andrea Pinto, Akshay Rangamani, Tomaso Poggio
Dyson Brownian motion and random matrix dynamics of weight matrices during learning
Gert Aarts (Swansea University), Ouraman Hajizadeh (Graz), Biagio Lucini (Swansea University), Chanju Park (Swansea University)