Zero Shot
Zero-shot learning aims to enable models to perform tasks on unseen data without any task-specific training, leveraging pre-trained knowledge to generalize to new situations. Current research focuses on improving zero-shot capabilities across diverse modalities (vision, language, audio) using large language models (LLMs), vision-language models (VLMs), and diffusion models, often incorporating techniques such as chain-of-thought prompting, knowledge retrieval, and prompt engineering to improve performance and interpretability. The field is significant because it promises more efficient and adaptable AI systems, with applications ranging from image editing and medical diagnosis to robotics and natural language processing.
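To make the zero-shot idea concrete, the sketch below classifies an image against natural-language label prompts the model was never fine-tuned on, relying only on a pre-trained vision-language model. This is a minimal illustration assuming the Hugging Face transformers library and the public openai/clip-vit-base-patch32 checkpoint; it is not the method of any particular paper listed below, and the image URL and candidate labels are arbitrary placeholders.

```python
# Minimal zero-shot image classification sketch (assumes the Hugging Face
# transformers library and the public openai/clip-vit-base-patch32 checkpoint).
# The candidate labels are not task-specific training data: the model scores
# them purely from its pre-trained image-text knowledge.
import requests
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Any RGB image works; this URL is just an illustrative placeholder.
image = Image.open(requests.get(
    "http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)

# Candidate classes are expressed as natural-language prompts.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=1)  # image-to-text similarity scores

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

Swapping in a different set of label prompts requires no retraining, which is the defining property of the zero-shot setting described above.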
Papers
Can Rule-Based Insights Enhance LLMs for Radiology Report Classification? Introducing the RadPrompt Methodology
Panagiotis Fytas, Anna Breger, Ian Selby, Simon Baker, Shahab Shahipasand, Anna Korhonen
Zero-shot Factual Consistency Evaluation Across Domains
Raunak Agarwal
ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling
William Yicheng Zhu, Keren Ye, Junjie Ke, Jiahui Yu, Leonidas Guibas, Peyman Milanfar, Feng Yang
AdaCBM: An Adaptive Concept Bottleneck Model for Explainable and Accurate Diagnosis
Townim F. Chowdhury, Vu Minh Hieu Phan, Kewen Liao, Minh-Son To, Yutong Xie, Anton van den Hengel, Johan W. Verjans, Zhibin Liao
Visual Grounding for Object-Level Generalization in Reinforcement Learning
Haobin Jiang, Zongqing Lu
Towards Zero-Shot Annotation of the Built Environment with Vision-Language Models (Vision Paper)
Bin Han, Yiwei Yang, Anat Caspi, Bill Howe
SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data
Yichen Lu, Jiaqi Song, Xuankai Chang, Hengwei Bian, Soumi Maiti, Shinji Watanabe
Fine-gained Zero-shot Video Sampling
Dengsheng Chen, Jie Hu, Xiaoming Wei, Enhua Wu
Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs
Elan Markowitz, Anil Ramakrishna, Jwala Dhamala, Ninareh Mehrabi, Charith Peris, Rahul Gupta, Kai-Wei Chang, Aram Galstyan
Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing
Ekaterina Iakovleva, Fabio Pizzati, Philip Torr, Stéphane Lathuilière
Leveraging Foundation Models for Zero-Shot IoT Sensing
Dinghao Xue, Xiaoran Fan, Tao Chen, Guohao Lan, Qun Song
AgEval: A Benchmark for Zero-Shot and Few-Shot Plant Stress Phenotyping with Multimodal LLMs
Muhammad Arbab Arshad, Talukder Zaki Jubery, Tirtho Roy, Rim Nassiri, Asheesh K. Singh, Arti Singh, Chinmay Hegde, Baskar Ganapathysubramanian, Aditya Balu, Adarsh Krishnamurthy, Soumik Sarkar
Robust Claim Verification Through Fact Detection
Nazanin Jafari, James Allan
I can listen but cannot read: An evaluation of two-tower multimodal systems for instrument recognition
Yannis Vasilakis, Rachel Bittner, Johan Pauwels
BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation
Xiang Zhang, Bingxin Ke, Hayko Riemenschneider, Nando Metzger, Anton Obukhov, Markus Gross, Konrad Schindler, Christopher Schroers
Scaling A Simple Approach to Zero-Shot Speech Recognition
Jinming Zhao, Vineel Pratap, Michael Auli