Paper ID: 2412.16956 • Published Dec 22, 2024
Semantic Hierarchical Prompt Tuning for Parameter-Efficient Fine-Tuning
Haowei Zhu, Fangyuan Zhang, Rui Qin, Tianxiang Pan, Junhai Yong, Bin Wang
As the scale of vision models continues to grow, Visual Prompt Tuning (VPT)
has emerged as a parameter-efficient transfer learning technique, noted for its
superior performance compared to full fine-tuning. However, indiscriminately
applying prompts to every layer without considering their inherent
correlations can cause significant disturbances, leading to suboptimal
transferability. Additionally, VPT disrupts the original self-attention
structure, affecting the aggregation of visual features, and lacks a mechanism
for explicitly mining discriminative visual features, which are crucial for
classification. To address these issues, we propose a Semantic Hierarchical
Prompt (SHIP) fine-tuning strategy. We adaptively construct semantic
hierarchies and use semantic-independent and semantic-shared prompts to learn
hierarchical representations. We also integrate attribute prompts and a prompt
matching loss to enhance feature discrimination and employ decoupled attention
for robustness and reduced inference costs. SHIP significantly improves
performance, achieving a 4.9% gain in accuracy over VPT with a ViT-B/16
backbone on VTAB-1k tasks. Our code is available at
this https URL