Zero-Shot
Zero-shot learning aims to enable models to perform tasks they were never explicitly trained for, leveraging knowledge acquired during pre-training to generalize to new situations. Current research focuses on improving zero-shot capabilities across modalities (vision, language, audio) using large language models (LLMs), vision-language models (VLMs), and diffusion models, often incorporating techniques such as chain-of-thought prompting, knowledge retrieval, and prompt engineering to improve performance and interpretability. The field matters because it promises more efficient and adaptable AI systems, with applications ranging from image editing and medical diagnosis to robotics and natural language processing.
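To make the core idea concrete, here is a minimal sketch of zero-shot text classification using Hugging Face's zero-shot-classification pipeline. The model choice, input text, and candidate labels are purely illustrative assumptions, and none of the papers listed below uses this exact setup; the point is only that a pre-trained model can score labels it was never trained to predict.

```python
from transformers import pipeline

# Zero-shot classification via natural language inference: the NLI model
# treats each candidate label as a hypothesis ("This example is about
# {label}.") and scores how well the input entails it, so no
# task-specific training data is required.
classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli",  # illustrative model choice
)

result = classifier(
    "The patient presented with chest pain and shortness of breath.",
    candidate_labels=["medical", "sports", "finance"],
)

# Labels come back sorted by score; print the most likely one.
print(result["labels"][0], result["scores"][0])
```

The same pattern generalizes to other modalities: a VLM such as CLIP performs zero-shot image classification by scoring images against text prompts like "a photo of a {label}", which is the mechanism several of the vision papers below build on.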
Papers
Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer
Yanpeng Zhao, Jack Hessel, Youngjae Yu, Ximing Lu, Rowan Zellers, Yejin Choi
MAVE: A Product Dataset for Multi-source Attribute Value Extraction
Li Yang, Qifan Wang, Zac Yu, Anand Kulkarni, Sumit Sanghai, Bin Shu, Jon Elsas, Bhargav Kanagal
Extreme Zero-Shot Learning for Extreme Text Classification
Yuanhao Xiong, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Inderjit Dhillon
Modality-Aware Triplet Hard Mining for Zero-shot Sketch-Based Image Retrieval
Zongheng Huang, YiFan Sun, Chuchu Han, Changxin Gao, Nong Sang
Decoupling Zero-Shot Semantic Segmentation
Jian Ding, Nan Xue, Gui-Song Xia, Dengxin Dai
Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data
Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov