Model Fusion

Model fusion combines the strengths of multiple machine learning models, with the aim of achieving better performance and robustness than any single model. Current research focuses on efficient fusion techniques for large language models (LLMs) and other deep learning architectures, including weight averaging, optimal transport, and mixture-of-experts models, to address challenges such as parameter interference and computational cost. These advances matter for the accuracy and reliability of AI systems across applications ranging from natural language processing and computer vision to personalized medicine and federated learning.
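
Of the methods named above, weight averaging is the simplest to illustrate. The following is a minimal sketch, not a reference implementation. It assumes all models share the same architecture, so their parameters line up by name and shape, and it represents each model as a plain dictionary of NumPy arrays. The function name `average_weights` and the parameter `coeffs` are hypothetical names chosen for this example.

```python
import numpy as np


def average_weights(models, coeffs=None):
    """Fuse models by a weighted average of their matching parameters.

    models: list of dicts mapping parameter name -> array (assumed layout).
    coeffs: optional per-model mixing weights; defaults to a uniform average.
    """
    if not models:
        raise ValueError("need at least one model")
    n = len(models)
    if coeffs is None:
        coeffs = [1.0 / n] * n
    if len(coeffs) != n:
        raise ValueError("need exactly one coefficient per model")

    # Every model must expose the same parameter names and shapes.
    reference = models[0]
    for model in models[1:]:
        if set(model) != set(reference):
            raise ValueError("models have different parameter names")
        for name in reference:
            if np.shape(model[name]) != np.shape(reference[name]):
                raise ValueError(f"shape mismatch for parameter {name!r}")

    fused = {}
    for name in reference:
        fused[name] = sum(
            c * np.asarray(model[name], dtype=float)
            for c, model in zip(coeffs, models)
        )
    return fused


# Toy usage with two "models" that each hold a weight vector and a bias.
model_a = {"w": np.array([1.0, 3.0]), "b": np.array([0.0])}
model_b = {"w": np.array([3.0, 5.0]), "b": np.array([2.0])}
fused = average_weights([model_a, model_b])
```

Averaging parameters position by position is only meaningful when the models are aligned, for example when they were fine-tuned from a common initialization. Independently trained networks may need their units matched first, which is one motivation for the alignment-based methods mentioned above.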

Papers