Paper ID: 2410.08371 • Published Oct 10, 2024
Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation
By merging models, AI systems can combine the distinct strengths of separate
language models, achieving a balance between multiple capabilities without
requiring substantial retraining. However, the integration process can be
intricate due to differences in training methods and fine-tuning, typically
necessitating specialized knowledge and repeated refinement. This paper
explores model merging techniques across a spectrum of complexity, examining
where automated methods like evolutionary strategies stand compared to
hyperparameter-driven approaches such as DARE and TIES-Merging, and simpler methods
like Model Soups. In addition, we introduce Differentiable Adaptive Merging
(DAM), an efficient, adaptive merging approach as an alternative to
evolutionary merging that optimizes model integration through scaling
coefficients, minimizing computational demands. Our findings reveal that even
simple averaging methods, like Model Soups, perform competitively when model
similarity is high, underscoring each technique's unique strengths and
limitations. We open-sourced DAM, including the implementation code and
experiment pipeline, on GitHub: this https URL
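
To make the two ends of the spectrum described above concrete, here is a minimal sketch in plain Python. It is not the released DAM implementation. The function names (`uniform_soup`, `scaled_merge`, `fit_coeffs`), the flat dictionary-of-lists parameter format, and the squared-error objective against a reference parameter set are all illustrative assumptions. The abstract does not state the objective DAM optimizes; it says only that DAM optimizes scaling coefficients. `uniform_soup` shows the simple averaging associated with Model Soups, and `fit_coeffs` shows the general idea of tuning one scaling coefficient per model by gradient descent.

```python
from typing import Dict, List

# Hypothetical toy format: parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def uniform_soup(models: List[Params]) -> Params:
    """Average every parameter elementwise across models (simple averaging)."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n
               for i in range(len(models[0][name]))]
        for name in models[0]
    }


def scaled_merge(models: List[Params], coeffs: List[float]) -> Params:
    """Merge as a weighted sum, with one scaling coefficient per model."""
    return {
        name: [sum(c * m[name][i] for c, m in zip(coeffs, models))
               for i in range(len(models[0][name]))]
        for name in models[0]
    }


def fit_coeffs(models: List[Params], target: Params,
               steps: int = 500, lr: float = 0.05) -> List[float]:
    """Tune the coefficients by gradient descent.

    Assumption: the loss is the squared error between the merged parameters
    and a reference `target`. This stands in for whatever objective a real
    method would use and is chosen only because its gradient is easy to
    write by hand.
    """
    coeffs = [1.0 / len(models)] * len(models)  # start from uniform averaging
    for _ in range(steps):
        merged = scaled_merge(models, coeffs)
        grads = [0.0] * len(models)
        for name, values in target.items():
            for i, t in enumerate(values):
                err = merged[name][i] - t
                for k, m in enumerate(models):
                    # d/dc_k of (merged - t)^2 = 2 * err * m_k
                    grads[k] += 2.0 * err * m[name][i]
        coeffs = [c - lr * g for c, g in zip(coeffs, grads)]
    return coeffs
```

Starting the coefficients at the uniform average means the first step of `fit_coeffs` reproduces `uniform_soup`. In this toy setup, the learned merge departs from plain averaging only as far as the objective requires.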