Zero Shot
Zero-shot learning aims to enable models to perform tasks on unseen data without any task-specific training, relying on pre-trained knowledge to generalize to new situations. Current research focuses on improving zero-shot capabilities across modalities (vision, language, audio) using large language models (LLMs), vision-language models (VLMs), and diffusion models, often combined with techniques such as chain-of-thought prompting, knowledge retrieval, and prompt engineering to improve performance and interpretability. The field matters because it promises more efficient and adaptable AI systems, with applications ranging from image editing and medical diagnosis to robotics and natural language processing.
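As a rough illustration of the idea, one common zero-shot recipe with vision-language models is to embed an input and a set of natural-language label prompts in a shared space, then pick the closest label, with no training on those labels. The sketch below is a toy version only: the 3-dimensional vectors in `TOY_EMBEDDINGS` are hand-picked stand-ins for the output of a pre-trained encoder, and the names `TOY_EMBEDDINGS`, `cosine`, and `zero_shot_classify` are hypothetical, not taken from any of the papers listed here.

```python
import math

# ASSUMPTION: hand-picked vectors standing in for a pre-trained encoder's output.
# The three axes loosely mean (animal, vehicle, food); no real model is involved.
TOY_EMBEDDINGS = {
    "a photo of a dog": (0.9, 0.1, 0.0),
    "a photo of a truck": (0.1, 0.9, 0.0),
    "a photo of a pizza": (0.0, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(query_vec, label_embeddings):
    """Return the label prompt whose embedding is most similar to the query.

    No classifier is trained on these labels: new labels can be added
    simply by adding another prompt embedding to the dictionary.
    """
    return max(
        label_embeddings,
        key=lambda label: cosine(query_vec, label_embeddings[label]),
    )


if __name__ == "__main__":
    # Hypothetical embedding of an unseen input, mostly along the "vehicle" axis.
    query = (0.2, 0.8, 0.1)
    print(zero_shot_classify(query, TOY_EMBEDDINGS))
```

In a real system the vectors would come from a pre-trained text and image (or audio) encoder, and the quality of the result would depend on how well that encoder's shared space generalizes to the new labels.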
Papers
GRIN: Zero-Shot Metric Depth with Pixel-Level Diffusion
Vitor Guizilini, Pavel Tokmakov, Achal Dave, Rares Ambrus
PROSE-FD: A Multimodal PDE Foundation Model for Learning Multiple Operators for Forecasting Fluid Dynamics
Yuxuan Liu, Jingmin Sun, Xinjie He, Griffin Pinney, Zecheng Zhang, Hayden Schaeffer
Benchmarking LLMs in Political Content Text-Annotation: Proof-of-Concept with Toxicity and Incivility Data
Bastián González-Bustamante
Can Large Language Models Grasp Event Signals? Exploring Pure Zero-Shot Event-based Recognition
Zongyou Yu, Qiang Qu, Xiaoming Chen, Chen Wang
DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors
Thomas Hanwen Zhu, Ruining Li, Tomas Jakab
From Explanations to Action: A Zero-Shot, Theory-Driven LLM Framework for Student Performance Feedback
Vinitra Swamy, Davide Romano, Bhargav Srinivasa Desikan, Oana-Maria Camburu, Tanja Käser
Zero-Shot Machine-Generated Text Detection Using Mixture of Large Language Models
Matthieu Dubois, François Yvon, Pablo Piantanida
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis
Helin Wang, Meng Yu, Jiarui Hai, Chen Chen, Yuchen Hu, Rilin Chen, Najim Dehak, Dong Yu
An Art-centric perspective on AI-based content moderation of nudity
Piera Riccio, Georgina Curto, Thomas Hofmann, Nuria Oliver
MAGDA: Multi-agent guideline-driven diagnostic assistance
David Bani-Harouni, Nassir Navab, Matthias Keicher
Robust Agility via Learned Zero Dynamics Policies
Noel Csomay-Shanklin, William D. Compton, Ivan Dario Jimenez Rodriguez, Eric R. Ambrose, Yisong Yue, Aaron D. Ames
Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments
Haritheja Etukuru, Norihito Naka, Zijin Hu, Seungjae Lee, Julian Mehu, Aaron Edsinger, Chris Paxton, Soumith Chintala, Lerrel Pinto, Nur Muhammad Mahi Shafiullah
Evaluating Multiview Object Consistency in Humans and Image Models
Tyler Bonnen, Stephanie Fu, Yutong Bai, Thomas O'Connell, Yoni Friedman, Nancy Kanwisher, Joshua B. Tenenbaum, Alexei A. Efros
EndoOmni: Zero-Shot Cross-Dataset Depth Estimation in Endoscopy by Robust Self-Learning from Noisy Labels
Qingyao Tian, Zhen Chen, Huai Liao, Xinyan Huang, Lujie Li, Sebastien Ourselin, Hongbin Liu
From Words to Poses: Enhancing Novel Object Pose Estimation with Vision Language Models
Tessa Pulli, Stefan Thalhammer, Simon Schwaiger, Markus Vincze