Zero Shot
Zero-shot learning aims to enable models to perform tasks, or recognize classes, for which they received no task-specific training examples, relying on pre-trained knowledge to generalize to new situations. Current research focuses on improving zero-shot capabilities across modalities (vision, language, audio) using large language models (LLMs), vision-language models (VLMs), and diffusion models, often with techniques such as chain-of-thought prompting, knowledge retrieval, and prompt engineering to improve performance and interpretability. The field matters because it promises more efficient and adaptable AI systems, with applications ranging from image editing and medical diagnosis to robotics and natural language processing.
Papers
Zero-shot racially balanced dataset generation using an existing biased StyleGAN2
Anubhav Jain, Nasir Memon, Julian Togelius
Zero-shot Item-based Recommendation via Multi-task Product Knowledge Graph Pre-Training
Ziwei Fan, Zhiwei Liu, Shelby Heinecke, Jianguo Zhang, Huan Wang, Caiming Xiong, Philip S. Yu
ZARA: Improving Few-Shot Self-Rationalization for Small Language Models
Wei-Lin Chen, An-Zi Yen, Cheng-Kuang Wu, Hen-Hsen Huang, Hsin-Hsi Chen
Meta-Optimization for Higher Model Generalizability in Single-Image Depth Prediction
Cho-Ying Wu, Yiqi Zhong, Junying Wang, Ulrich Neumann
MMG-Ego4D: Multi-Modal Generalization in Egocentric Action Recognition
Xinyu Gong, Sreyas Mohan, Naina Dhingra, Jean-Charles Bazin, Yilei Li, Zhangyang Wang, Rakesh Ranjan
ImageBind: One Embedding Space To Bind Them All
Rohit Girdhar, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra
CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case Encoding
Yixiao Ma, Yueyue Wu, Weihang Su, Qingyao Ai, Yiqun Liu
Boosting Zero-shot Cross-lingual Retrieval by Training on Artificially Code-Switched Data
Robert Litschko, Ekaterina Artemova, Barbara Plank
Zero-shot personalized lip-to-speech synthesis with face image based voice control
Zheng-Yan Sheng, Yang Ai, Zhen-Hua Ling
Towards Zero-Shot Frame Semantic Parsing with Task Agnostic Ontologies and Simple Labels
Danilo Ribeiro, Omid Abdar, Jack Goetz, Mike Ross, Annie Dong, Kenneth Forbus, Ahmed Mohamed
Otter: A Multi-Modal Model with In-Context Instruction Tuning
Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Jingkang Yang, Ziwei Liu
Zero-shot performance of the Segment Anything Model (SAM) in 2D medical imaging: A comprehensive evaluation and practical guidelines
Christian Mattjie, Luis Vinicius de Moura, Rafaela Cappelari Ravazio, Lucas Silveira Kupssinskü, Otávio Parraga, Marcelo Mussi Delucis, Rodrigo Coelho Barros
SAM on Medical Images: A Comprehensive Study on Three Prompt Modes
Dongjie Cheng, Ziyuan Qin, Zekun Jiang, Shaoting Zhang, Qicheng Lao, Kang Li