Paper ID: 2411.18322 • Published Nov 27, 2024
Mixture of Experts in Image Classification: What's the Sweet Spot?
Mathurin Videau, Alessandro Leite, Marc Schoenauer, Olivier Teytaud
Mixture-of-Experts (MoE) models have shown promising potential for
parameter-efficient scaling across various domains. However, their adoption in
computer vision remains limited, often requiring large-scale datasets
comprising billions of samples. In this study, we investigate the integration
of MoE within computer vision models and explore various MoE configurations on
open datasets. When introducing MoE layers in image classification, the best
results are obtained for models with a moderate number of activated parameters
per sample. However, these gains gradually vanish as the number of parameters
per sample increases.
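
As a rough illustration of the kind of layer the abstract refers to (not the paper's implementation), the sketch below shows a token-level mixture-of-experts feed-forward block with top-k gating in PyTorch. All names and hyperparameters here (MoEMLP, num_experts=8, top_k=2, the 192-dim toy input) are assumptions for illustration; the "activated parameters per sample" discussed in the abstract correspond to the experts actually selected per token, controlled by top_k rather than the total expert count.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEMLP(nn.Module):
    """Token-level mixture-of-experts feed-forward block with top-k gating (illustrative sketch)."""

    def __init__(self, dim, hidden_dim, num_experts=8, top_k=2):
        super().__init__()
        self.num_experts = num_experts
        self.top_k = top_k
        # Router assigns each token a score per expert.
        self.router = nn.Linear(dim, num_experts)
        # Each expert is a small two-layer MLP, like a ViT feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(dim, hidden_dim),
                nn.GELU(),
                nn.Linear(hidden_dim, dim),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (batch, tokens, dim) -> flatten tokens for per-token routing.
        b, t, d = x.shape
        flat = x.reshape(b * t, d)
        logits = self.router(flat)                       # (b*t, num_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)             # normalize over the k selected experts
        out = torch.zeros_like(flat)
        # Only the selected experts run for each token, so the number of
        # activated parameters per sample is set by top_k, not num_experts.
        for e in range(self.num_experts):
            for slot in range(self.top_k):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](flat[mask])
        return out.reshape(b, t, d)


if __name__ == "__main__":
    layer = MoEMLP(dim=192, hidden_dim=768, num_experts=8, top_k=2)
    tokens = torch.randn(4, 196, 192)   # e.g. a 14x14 patch grid from a small ViT
    print(layer(tokens).shape)          # torch.Size([4, 196, 192])
```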