Zero-Shot
Zero-shot learning aims to enable models to perform tasks or recognize categories they were never explicitly trained on, leveraging knowledge acquired during pre-training to generalize to new situations. Current research focuses on improving zero-shot capabilities across diverse modalities (vision, language, audio) using large language models (LLMs), vision-language models (VLMs), and diffusion models, often incorporating techniques such as chain-of-thought prompting, knowledge retrieval, and prompt engineering to enhance performance and interpretability. The field is significant because it promises more efficient and adaptable AI systems, with applications ranging from image editing and medical diagnosis to robotics and natural language processing.
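To make the core idea concrete, below is a minimal zero-shot image-classification sketch using a publicly available CLIP checkpoint through the Hugging Face transformers library. The checkpoint name, sample image URL, and candidate labels are illustrative choices, not drawn from any of the papers listed here: the point is that the class names are supplied at inference time as natural-language prompts, with no task-specific classifier ever trained.

```python
# Minimal zero-shot classification sketch with CLIP (Hugging Face transformers).
# The checkpoint, image URL, and labels below are illustrative assumptions.
import requests
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Any RGB image works; here we fetch a sample photo.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# No classifier head was trained on these labels: CLIP scores the image
# against the text prompts directly in a shared embedding space.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)  # similarity scores -> probabilities
for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

Swapping in a different label set requires no retraining, only new prompt strings, which is what makes the approach "zero-shot."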
Papers
AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models
Yongjian Wu, Yang Zhou, Jiya Saiyin, Bingzheng Wei, Maode Lai, Jianzhong Shou, Yan Xu
GE2E-KWS: Generalized End-to-End Training and Evaluation for Zero-shot Keyword Spotting
Pai Zhu, Jacob W. Bartel, Dhruuv Agarwal, Kurt Partridge, Hyun Jin Park, Quan Wang
CLIPtortionist: Zero-shot Text-driven Deformation for Manufactured 3D Shapes
Xianghao Xu, Srinath Sridhar, Daniel Ritchie
PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification
Ashish Seth, Ramaneswaran Selvakumar, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha
Are LLMs Good Zero-Shot Fallacy Classifiers?
Fengjun Pan, Xiaobao Wu, Zongrui Li, Anh Tuan Luu
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control
Yujie Wei, Shiwei Zhang, Hangjie Yuan, Xiang Wang, Haonan Qiu, Rui Zhao, Yutong Feng, Feng Liu, Zhizhong Huang, Jiaxin Ye, Yingya Zhang, Hongming Shan
Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers
Yuchen Liang, Peizhong Ju, Yingbin Liang, Ness Shroff
ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions
Shailaja Keyur Sampat, Yezhou Yang, Chitta Baral
Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts?
Shailaja Keyur Sampat, Maitreya Patel, Yezhou Yang, Chitta Baral
LLM Confidence Evaluation Measures in Zero-Shot CSS Classification
David Farr, Iain Cruickshank, Nico Manzonelli, Nicholas Clark, Kate Starbird, Jevin West
Interpreting and Analysing CLIP's Zero-Shot Image Classification via Mutual Knowledge
Fawaz Sammani, Nikos Deligiannis
Towards Zero-Shot Camera Trap Image Categorization
Jiří Vyskočil, Lukas Picek
Towards Graph Foundation Models: The Perspective of Zero-shot Reasoning on Knowledge Graphs
Kai Wang, Siqiang Luo
Evaluating Cascaded Methods of Vision-Language Models for Zero-Shot Detection and Association of Hardhats for Increased Construction Safety
Lucas Choi, Ross Greer
Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming
Yilun Hao, Yang Zhang, Chuchu Fan
Zero-shot Model-based Reinforcement Learning using Large Language Models
Abdelhakim Benechehab, Youssef Attia El Hili, Ambroise Odonnat, Oussama Zekri, Albert Thomas, Giuseppe Paolo, Maurizio Filippone, Ievgen Redko, Balázs Kégl
Tree of Attributes Prompt Learning for Vision-Language Models
Tong Ding, Wanhua Li, Zhongqi Miao, Hanspeter Pfister
DMDSpeech: Distilled Diffusion Model Surpassing The Teacher in Zero-shot Speech Synthesis via Direct Metric Optimization
Yingahao Aaron Li, Rithesh Kumar, Zeyu Jin
MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer
Minghao Zhu, Zhengpu Wang, Mengxian Hu, Ronghao Dang, Xiao Lin, Xun Zhou, Chengju Liu, Qijun Chen