Parameter Sharing
Parameter sharing in machine learning reduces memory and computational costs by reusing model parameters across tasks, layers, or agents. Current research develops novel sharing strategies within transformers, convolutional neural networks, and multi-agent reinforcement learning models, often combining them with mixture-of-experts and adaptive gating mechanisms to balance resource allocation against performance. Efficient parameter sharing matters because it enables deploying large-scale models in resource-constrained environments and improves the scalability and training efficiency of complex systems.
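As a concrete illustration of one such strategy, the sketch below ties a single transformer block's weights across depth, in the style of ALBERT's cross-layer sharing. This is a minimal sketch under assumed settings: the class name, dimensions, and depth are illustrative choices, not drawn from the papers listed here.

```python
# Minimal sketch of cross-layer parameter sharing in PyTorch:
# one transformer block is applied repeatedly, so the parameter
# count stays constant regardless of network depth.
import torch
import torch.nn as nn


class SharedLayerEncoder(nn.Module):
    """Reuses a single transformer block at every depth step
    instead of stacking independently parameterized blocks."""

    def __init__(self, d_model: int = 256, n_heads: int = 4, depth: int = 6):
        super().__init__()
        # One block whose weights are shared across all depth steps.
        self.block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.depth = depth

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for _ in range(self.depth):
            x = self.block(x)  # same parameters at every iteration
        return x


x = torch.randn(2, 10, 256)  # (batch, sequence, features)
model = SharedLayerEncoder()
print(model(x).shape)  # torch.Size([2, 10, 256])
# Parameter count equals that of one block, however deep the model is.
print(sum(p.numel() for p in model.parameters()))
```

The trade-off this sketch exposes is typical of parameter sharing: depth comes at no extra parameter cost, but every layer is forced to compute the same function, which can limit expressivity relative to an unshared stack of the same depth.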
Papers
Efficient model compression with Random Operation Access Specific Tile (ROAST) hashing
Aditya Desai, Keren Zhou, Anshumali Shrivastava
SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks
Chien-Yu Lin, Anish Prabhu, Thomas Merth, Sachin Mehta, Anurag Ranjan, Maxwell Horton, Mohammad Rastegari