Zero-Shot
Zero-shot learning aims to enable models to perform tasks on unseen data without task-specific training, leveraging pre-trained knowledge to generalize to new situations. Current research focuses on improving zero-shot capabilities across modalities (vision, language, audio) using large language models (LLMs), vision-language models (VLMs), and diffusion models, often incorporating techniques such as chain-of-thought prompting, knowledge retrieval, and prompt engineering to improve performance and interpretability. The field matters because it promises more efficient and adaptable AI systems, with applications ranging from image editing and medical diagnosis to robotics and natural language processing.
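As a concrete illustration of the zero-shot idea, the sketch below mimics CLIP-style zero-shot classification: candidate class names are embedded, and an input is assigned the label whose embedding is most similar, with no task-specific training. The `embed` function here is a toy stand-in (a character-frequency vector) for a real pre-trained encoder; everything else is an assumption for illustration, not any particular paper's method.

```python
import math


def embed(text: str) -> list[float]:
    # Toy stand-in for a pre-trained encoder: a 26-dim character-frequency
    # vector. A real zero-shot pipeline would use a model such as CLIP's
    # text/image towers to produce these embeddings.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(u: list[float], v: list[float]) -> float:
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def zero_shot_classify(query: str, labels: list[str]) -> str:
    # "Zero-shot" in the CLIP sense: no classifier is trained; labels are
    # scored purely by similarity between query and label embeddings.
    q = embed(query)
    return max(labels, key=lambda label: cosine(q, embed(label)))
```

For example, `zero_shot_classify("a photo of a cat", ["cat", "dog"])` picks `"cat"` because the query shares more characters with that label; with real encoders the same scoring rule operates on learned semantic embeddings, which is what lets pre-trained models handle unseen classes.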
Papers
CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models
Yizhi LI, Ge Zhang, Xingwei Qu, Jiali Li, Zhaoqun Li, Zekun Wang, Hao Li, Ruibin Yuan, Yinghao Ma, Kai Zhang, Wangchunshu Zhou, Yiming Liang, Lei Zhang, Lei Ma, Jiajun Zhang, Zuowen Li, Stephen W. Huang, Chenghua Lin, Jie Fu
Identifying Factual Inconsistencies in Summaries: Grounding Model Inference via Task Taxonomy
Liyan Xu, Zhenlin Su, Mo Yu, Jin Xu, Jinho D. Choi, Jie Zhou, Fei Liu
HumanEval on Latest GPT Models -- 2024
Daniel Li, Lincoln Murr
Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports
Felix J. Dorfner, Liv Jürgensen, Leonhard Donle, Fares Al Mohamad, Tobias R. Bodenmann, Mason C. Cleveland, Felix Busch, Lisa C. Adams, James Sato, Thomas Schultz, Albert E. Kim, Jameson Merkow, Keno K. Bressem, Christopher P. Bridge
LLM as Prompter: Low-resource Inductive Reasoning on Arbitrary Knowledge Graphs
Kai Wang, Yuwei Xu, Zhiyong Wu, Siqiang Luo
GIM: Learning Generalizable Image Matcher From Internet Videos
Xuelun Shen, Zhipeng Cai, Wei Yin, Matthias Müller, Zijun Li, Kaixuan Wang, Xiaozhi Chen, Cheng Wang
OpenFMNav: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models
Yuxuan Kuang, Hai Lin, Meng Jiang
Assessing biomedical knowledge robustness in large language models by query-efficient sampling attacks
R. Patrick Xian, Alex J. Lee, Satvik Lolla, Vincent Wang, Qiming Cui, Russell Ro, Reza Abbasi-Asl
Learning How To Ask: Cycle-Consistency Refines Prompts in Multimodal Foundation Models
Maurice Diesendruck, Jianzhe Lin, Shima Imani, Gayathri Mahalingam, Mingyang Xu, Jie Zhao
Benchmarking multi-component signal processing methods in the time-frequency plane
Juan M. Miramont, Rémi Bardenet, Pierre Chainais, Francois Auger
Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models
Shaeke Salman, Md Montasir Bin Shams, Xiuwen Liu, Lingjiong Zhu
One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill
Sangwoo Shin, Daehee Lee, Minjong Yoo, Woo Kyung Kim, Honguk Woo
Foundational Inference Models for Dynamical Systems
Patrick Seifner, Kostadin Cvejoski, Ramses J. Sanchez
Only the Curve Shape Matters: Training Foundation Models for Zero-Shot Multivariate Time Series Forecasting through Next Curve Shape Prediction
Cheng Feng, Long Huang, Denis Krompass
An Empirical Study Into What Matters for Calibrating Vision-Language Models
Weijie Tu, Weijian Deng, Dylan Campbell, Stephen Gould, Tom Gedeon
On the Out-Of-Distribution Generalization of Multimodal Large Language Models
Xingxuan Zhang, Jiansheng Li, Wenjing Chu, Junjia Hai, Renzhe Xu, Yuqing Yang, Shikai Guan, Jiazheng Xu, Peng Cui
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson