Zero Shot
Zero-shot learning aims to let models handle tasks, classes, or domains for which they received no task-specific training examples, relying on knowledge acquired during pre-training to generalize. Current research works to improve zero-shot performance across modalities (vision, language, audio) using large language models (LLMs), vision-language models (VLMs), and diffusion models, often combined with techniques such as chain-of-thought prompting, knowledge retrieval, and prompt engineering to improve accuracy and interpretability. The field matters because it promises AI systems that adapt to new problems without task-specific labeled data or retraining, with applications ranging from image editing and medical diagnosis to robotics and natural language processing.
Papers
CLIP-AD: A Language-Guided Staged Dual-Path Model for Zero-shot Anomaly Detection
Xuhai Chen, Jiangning Zhang, Guanzhong Tian, Haoyang He, Wuhao Zhang, Yabiao Wang, Chengjie Wang, Yong Liu
Instructive Decoding: Instruction-Tuned Large Language Models are Self-Refiner from Noisy Instructions
Taehyeon Kim, Joonkee Kim, Gihun Lee, Se-Young Yun
GG-LLM: Geometrically Grounding Large Language Models for Zero-shot Human Activity Forecasting in Human-Aware Task Planning
Moritz A. Graule, Volkan Isler
CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models
Ziyang Yuan, Mingdeng Cao, Xintao Wang, Zhongang Qi, Chun Yuan, Ying Shan
Intra-Modal Proxy Learning for Zero-Shot Visual Categorization with CLIP
Qi Qian, Yuanhong Xu, Juhua Hu
Improving Zero-shot Reader by Reducing Distractions from Irrelevant Documents in Open-Domain Question Answering
Sukmin Cho, Jeongyeon Seo, Soyeong Jeong, Jong C. Park
HyperFields: Towards Zero-Shot Generation of NeRFs from Text
Sudarshan Babu, Richard Liu, Avery Zhou, Michael Maire, Greg Shakhnarovich, Rana Hanocka
Apollo: Zero-shot MultiModal Reasoning with Multiple Experts
Daniela Ben-David, Tzuf Paz-Argaman, Reut Tsarfaty
ChatGPT is a Potential Zero-Shot Dependency Parser
Boda Lin, Xinyi Zhou, Binghao Tang, Xiaocheng Gong, Si Li
Learning Robust Deep Visual Representations from EEG Brain Recordings
Prajwal Singh, Dwip Dalal, Gautam Vashishtha, Krishna Miyapuram, Shanmuganathan Raman
LlamaRec: Two-Stage Recommendation using Large Language Models for Ranking
Zhenrui Yue, Sara Rabhi, Gabriel de Souza Pereira Moreira, Dong Wang, Even Oldridge
ZGUL: Zero-shot Generalization to Unseen Languages using Multi-source Ensembling of Language Adapters
Vipul Rathore, Rajdeep Dhingra, Parag Singla, Mausam
GPT-4 as an Effective Zero-Shot Evaluator for Scientific Figure Captions
Ting-Yao Hsu, Chieh-Yang Huang, Ryan Rossi, Sungchul Kim, C. Lee Giles, Ting-Hao K. Huang
Videoprompter: an ensemble of foundational models for zero-shot video understanding
Adeel Yousaf, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah
Text2Topic: Multi-Label Text Classification System for Efficient Topic Detection in User Generated Content with Zero-Shot Capabilities
Fengjun Wang, Moran Beladev, Ofri Kleinfeld, Elina Frayerman, Tal Shachar, Eran Fainman, Karen Lastmann Assaraf, Sarai Mizrachi, Benjamin Wang