Test-Time Prompt Tuning

Test-time prompt tuning (TPT) adapts pre-trained vision-language models such as CLIP to new, unseen data *during inference*: rather than retraining the model, it optimizes a small set of prompt parameters for each test sample, keeping the model's weights frozen and requiring no labeled data. Current research focuses on improving TPT's efficiency and effectiveness, exploring self-supervised objectives (e.g., entropy minimization over augmented views), low-rank adaptation, and data augmentation strategies to tailor prompts to individual test samples. This approach strengthens zero-shot generalization to new domains and tasks, broadening the adaptability and practical applicability of large vision-language models in real-world scenarios.

Papers