Model Expansion

Model expansion is the practice of scaling up an existing machine learning model, typically a deep neural network or transformer, so that it gains capacity and performance without being retrained from scratch. Current research explores several expansion techniques, including graph-based methods, iterative local expansions, and function-preserving transformations, which enlarge a network while leaving its outputs unchanged so that training can continue from the smaller model's solution. These techniques are often applied to generative models, continual learning systems, and information retrieval. The shared goal is to cut the substantial computational cost of training increasingly large models by reusing what smaller models have already learned.
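
To make the idea of a function-preserving transformation concrete, here is a minimal sketch of widening one hidden layer in a small ReLU network. It is an illustration only, not the method of any particular paper listed here: the function names (`widen_hidden_layer`, `forward`) and the two-layer setup are assumptions chosen for the example. Each new hidden unit copies an existing one, and the outgoing weights of every copied unit are divided by its number of copies, so the network's output stays the same.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, x)) for row in W]


def forward(W1, b1, W2, x):
    """Compute y = W2 @ relu(W1 @ x + b1) for a two-layer network."""
    pre = [z + b for z, b in zip(matvec(W1, x), b1)]
    hidden = [max(0.0, z) for z in pre]
    return matvec(W2, hidden)


def widen_hidden_layer(W1, b1, W2, new_width, rng):
    """Return wider (W1, b1, W2) that compute the same function.

    Hypothetical helper for illustration: new units duplicate randomly
    chosen existing units, and outgoing weights are split among copies.
    """
    old_width = len(W1)
    if new_width < old_width:
        raise ValueError("new_width must be >= the current hidden width")

    # mapping[k] is the original unit that new unit k copies.
    extra = [rng.randrange(old_width) for _ in range(new_width - old_width)]
    mapping = list(range(old_width)) + extra

    # Number of copies of each original unit after widening.
    counts = [mapping.count(j) for j in range(old_width)]

    # Incoming weights and biases are copied unchanged.
    W1_new = [list(W1[j]) for j in mapping]
    b1_new = [b1[j] for j in mapping]

    # Outgoing weights are divided by the copy count so that the copies
    # together contribute exactly what the original unit did.
    W2_new = [[row[j] / counts[j] for j in mapping] for row in W2]
    return W1_new, b1_new, W2_new


# Small demonstration with random weights (3 inputs, 4 hidden, 2 outputs).
rng = random.Random(0)
W1 = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(4)]
b1 = [rng.gauss(0, 1) for _ in range(4)]
W2 = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(2)]
x = [rng.gauss(0, 1) for _ in range(3)]

W1w, b1w, W2w = widen_hidden_layer(W1, b1, W2, 7, rng)
before = forward(W1, b1, W2, x)
after = forward(W1w, b1w, W2w, x)
```

In this sketch `before` and `after` agree up to floating-point rounding, even though the hidden layer has grown from 4 to 7 units. In practice the duplicated units usually need a small perturbation before further training, since exact copies receive identical gradients and would otherwise stay identical.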

Papers