Paper ID: 2408.11449 • Published Aug 21, 2024
Enabling Small Models for Zero-Shot Selection and Reuse through Model Label Learning
Jia Zhang, Zhi Zhou, Lan-Zhe Guo, Yu-Feng Li
TL;DR
Get AI-generated summaries with premium
Get AI-generated summaries with premium
Vision-language models (VLMs) like CLIP have demonstrated impressive
zero-shot ability in image classification tasks by aligning text and images but
suffer inferior performance compared with task-specific expert models. On the
contrary, expert models excel in their specialized domains but lack zero-shot
ability for new tasks. How to obtain both the high performance of expert models
and zero-shot ability is an important research direction. In this paper, we
attempt to demonstrate that by constructing a model hub and aligning models
with their functionalities using model labels, new tasks can be solved in a
zero-shot manner by effectively selecting and reusing models in the hub. We
introduce a novel paradigm, Model Label Learning (MLL), which bridges the gap
between models and their functionalities through a Semantic Directed Acyclic
Graph (SDAG) and leverages an algorithm, Classification Head Combination
Optimization (CHCO), to select capable models for new tasks. Compared with the
foundation model paradigm, it is less costly and more scalable, i.e., the
zero-shot ability grows with the sizes of the model hub. Experiments on seven
real-world datasets validate the effectiveness and efficiency of MLL,
demonstrating that expert models can be effectively reused for zero-shot tasks.
Our code will be released publicly.