Zero Shot
Zero-shot learning aims to enable models to perform tasks they were never explicitly trained on, leveraging knowledge acquired during pre-training to generalize to unseen classes, domains, or instructions. Current research focuses on improving zero-shot capabilities across diverse modalities (vision, language, audio) using large language models (LLMs), vision-language models (VLMs), and diffusion models, often incorporating techniques such as chain-of-thought prompting, knowledge retrieval, and prompt engineering to improve performance and interpretability. This line of work matters because it promises more efficient and adaptable AI systems, with applications ranging from image editing and medical diagnosis to robotics and natural language processing.
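To make the idea concrete, the minimal sketch below shows zero-shot image classification with a pretrained vision-language model (CLIP), in the spirit of the CLIP-based papers listed further down: the model scores an image against arbitrary text labels it was never fine-tuned on. It assumes the Hugging Face transformers library; the image path and candidate labels are placeholders chosen purely for illustration.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load a pretrained image-text model and its preprocessing pipeline.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Placeholder image and candidate labels; no task-specific training is involved.
image = Image.open("example.jpg")
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

# Encode the image and all label prompts, then compare them in the shared embedding space.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image-to-text similarity scores, turned into a probability over the candidate labels.
probs = outputs.logits_per_image.softmax(dim=-1)
for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

Because the label set is supplied at inference time as free-form text, swapping in new classes requires no retraining, which is the defining property of the zero-shot setting.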
Papers
FitCLIP: Refining Large-Scale Pretrained Image-Text Models for Zero-Shot Video Understanding Tasks
Santiago Castro, Fabian Caba Heilbron
CLIP-Mesh: Generating textured meshes from text using pretrained image-text models
Nasir Mohammad Khalid, Tianhao Xie, Eugene Belilovsky, Tiberiu Popa
minicons: Enabling Flexible Behavioral and Representational Analyses of Transformer Language Models
Kanishka Misra
Factored World Models for Zero-Shot Generalization in Robotic Manipulation
Ondrej Biza, Thomas Kipf, David Klee, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong
Distilling Hypernymy Relations from Language Models: On the Effectiveness of Zero-Shot Taxonomy Induction
Devansh Jain, Luis Espinosa Anke