Zero Shot
Zero-shot learning aims to enable models to perform tasks on unseen data without any task-specific training, leveraging pre-trained knowledge to generalize to new situations. Current research focuses on improving zero-shot capabilities across diverse modalities (vision, language, audio) using large language models (LLMs), vision-language models (VLMs), and diffusion models, often incorporating techniques like chain-of-thought prompting, knowledge retrieval, and prompt engineering to enhance performance and interpretability. This field is significant because it promises more efficient and adaptable AI systems, impacting various applications from image editing and medical diagnosis to robotics and natural language processing.
Papers
Can We Estimate Purchase Intention Based on Zero-shot Speech Emotion Recognition?
Ryotaro Nagase, Takashi Sumiyoshi, Natsuo Yamashita, Kota Dohi, Yohei Kawaguchi
DRCap: Decoding CLAP Latents with Retrieval-augmented Generation for Zero-shot Audio Captioning
Xiquan Li, Wenxi Chen, Ziyang Ma, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Qiuqiang Kong, Xie Chen
Mamba4Cast: Efficient Zero-Shot Time Series Forecasting with State Space Models
Sathya Kamesh Bhethanabhotla, Omar Swelam, Julien Siems, David Salinas, Frank Hutter
SegGrasp: Zero-Shot Task-Oriented Grasping via Semantic and Geometric Guided Segmentation
Haosheng Li, Weixin Mao, Weipeng Deng, Chenyu Meng, Rui Zhang, Fan Jia, Tiancai Wang, Haoqiang Fan, Hongan Wang, Xiaoming Deng
Chain-of-Restoration: Multi-Task Image Restoration Models are Zero-Shot Step-by-Step Universal Image Restorers
Jin Cao, Deyu Meng, Xiangyong Cao
Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality
Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, In So Kweon, Junmo Kim
ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering
Francesco Maria Molfese, Simone Conia, Riccardo Orlando, Roberto Navigli
PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing
Feng Tian, Yixuan Li, Yichao Yan, Shanyan Guan, Yanhao Ge, Xiaokang Yang
MC-CoT: A Modular Collaborative CoT Framework for Zero-shot Medical-VQA with LLM and MLLM Integration
Lai Wei, Wenkai Wang, Xiaoyu Shen, Yu Xie, Zhihao Fan, Xiaojin Zhang, Zhongyu Wei, Wei Chen
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
Yuto Nishimura, Takumi Hirose, Masanari Ohi, Hideki Nakayama, Nakamasa Inoue
Noise Crystallization and Liquid Noise: Zero-shot Video Generation using Image Diffusion Models
Muhammad Haaris Khan, Hadrien Reynaud, Bernhard Kainz
PsFuture: A Pseudo-Future-based Zero-Shot Adaptive Policy for Simultaneous Machine Translation
Libo Zhao, Jing Li, Ziqian Zeng
ForgeryTTT: Zero-Shot Image Manipulation Localization with Test-Time Training
Weihuang Liu, Xi Shen, Chi-Man Pun, Xiaodong Cun