Online Merging

Online merging combines multiple trained neural network models into a single model, with the aim of reducing resource consumption, improving generalization, and streamlining model development. Current research emphasizes scaling merging techniques to larger models (e.g., transformers with billions of parameters) and exploring a range of merging methods, including averaging, least-squares optimization, and specialized techniques such as Foldable SuperNets. Efficient model merging matters because it can improve the performance and cost-effectiveness of machine learning systems across diverse applications, from image processing and natural language processing to autonomous driving and program repair.
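
As a rough illustration of the simplest method mentioned above, averaging, the sketch below merges models that share an identical architecture by taking a (optionally weighted) mean of each parameter. The representation of a model as a dictionary of flat float lists, and the names `StateDict` and `average_merge`, are assumptions made for this example only; they are not taken from any particular paper or library.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical representation for this sketch: a "model" is a dict that maps
# parameter names to flat lists of floats. Real frameworks use tensors.
StateDict = Dict[str, List[float]]


def average_merge(
    models: Sequence[StateDict],
    weights: Optional[Sequence[float]] = None,
) -> StateDict:
    """Merge models by a weighted element-wise average of their parameters."""
    if not models:
        raise ValueError("need at least one model to merge")
    if weights is None:
        weights = [1.0] * len(models)  # plain (uniform) averaging
    if len(weights) != len(models):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    # Averaging only makes sense when all models have the same parameters
    # with the same shapes, so check that before merging.
    reference = models[0]
    for model in models[1:]:
        if model.keys() != reference.keys():
            raise ValueError("models have different parameter names")
        for name, values in reference.items():
            if len(model[name]) != len(values):
                raise ValueError(f"shape mismatch for parameter {name!r}")

    merged: StateDict = {}
    for name, values in reference.items():
        merged[name] = [
            sum(w * m[name][i] for w, m in zip(weights, models)) / total
            for i in range(len(values))
        ]
    return merged


# Usage: merge two toy "models" uniformly, then with a 3:1 weighting.
model_a = {"w": [1.0, 2.0], "b": [0.0]}
model_b = {"w": [3.0, 4.0], "b": [2.0]}
uniform = average_merge([model_a, model_b])
weighted = average_merge([model_a, model_b], weights=[3.0, 1.0])
```

Plain averaging like this assumes the models' parameters are already aligned with one another (for example, models fine-tuned from a common starting point); the other methods named above address cases where that assumption is too strong.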

Papers