Zero Shot
Zero-shot learning aims to enable models to perform tasks or recognize categories they were never explicitly trained on, without task-specific training examples, by leveraging pre-trained knowledge to generalize to new situations. Current research focuses on improving zero-shot capabilities across diverse modalities (vision, language, audio) using large language models (LLMs), vision-language models (VLMs), and diffusion models, often incorporating techniques such as chain-of-thought prompting, knowledge retrieval, and prompt engineering to enhance performance and interpretability. The field matters because it promises more efficient and adaptable AI systems, with applications ranging from image editing and medical diagnosis to robotics and natural language processing.
Papers
RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation
Yuxuan Kuang, Junjie Ye, Haoran Geng, Jiageng Mao, Congyue Deng, Leonidas Guibas, He Wang, Yue Wang
Elevating All Zero-Shot Sketch-Based Image Retrieval Through Multimodal Prompt Learning
Mainak Singha, Ankit Jha, Divyam Gupta, Pranav Singla, Biplab Banerjee
Domain-Aware Fine-Tuning of Foundation Models
Ugur Ali Kaplan, Margret Keuper, Anna Khoreva, Dan Zhang, Yumeng Li
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
Marco Mistretta, Alberto Baldrati, Marco Bertini, Andrew D. Bagdanov
ZEAL: Surgical Skill Assessment with Zero-shot Tool Inference Using Unified Foundation Model
Satoshi Kondo
Semantic Compositions Enhance Vision-Language Contrastive Learning
Maxwell Aladago, Lorenzo Torresani, Soroush Vosoughi
Residual-MPPI: Online Policy Customization for Continuous Control
Pengcheng Wang, Chenran Li, Catherine Weaver, Kenta Kawamoto, Masayoshi Tomizuka, Chen Tang, Wei Zhan
ZeroDDI: A Zero-Shot Drug-Drug Interaction Event Prediction Method with Semantic Enhanced Learning and Dual-Modal Uniform Alignment
Ziyan Wang, Zhankun Xiong, Feng Huang, Xuan Liu, Wen Zhang
Geode: A Zero-shot Geospatial Question-Answering Agent with Explicit Reasoning and Precise Spatio-Temporal Retrieval
Devashish Vikas Gupta, Azeez Syed Ali Ishaqui, Divya Kiran Kadiyala
Role-Play Zero-Shot Prompting with Large Language Models for Open-Domain Human-Machine Conversation
Ahmed Njifenjou, Virgile Sucal, Bassam Jabaian, Fabrice Lefèvre
Zero-shot prompt-based classification: topic labeling in times of foundation models in German Tweets
Simon Münker, Kai Kugler, Achim Rettinger
Boosting Soft Q-Learning by Bounding
Jacob Adamczyk, Volodymyr Makarenko, Stas Tiomkin, Rahul V. Kulkarni
Changen2: Multi-Temporal Remote Sensing Generative Change Foundation Model
Zhuo Zheng, Stefano Ermon, Dongjun Kim, Liangpei Zhang, Yanfei Zhong