Dense Retriever
Dense retrieval aims to efficiently find relevant information (e.g., documents, passages) in large collections by representing both queries and items as dense vectors, so that relevance can be estimated through fast similarity comparisons. Current research emphasizes improving the accuracy and efficiency of these methods, exploring techniques such as contrastive learning, knowledge distillation, and the integration of large language models (LLMs), particularly in low-resource or zero-shot scenarios. These advances matter for applications such as question answering, conversational search, and biomedical literature search, where they enable faster and more accurate information access.
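To make the core idea concrete, here is a minimal sketch of dense retrieval with NumPy. The embeddings are toy vectors standing in for the output of a learned encoder (in practice, a neural model maps each query and passage to a high-dimensional vector); the function and variable names are illustrative, not from any specific paper listed below.

```python
import numpy as np

def cosine_scores(query_vec, doc_matrix):
    """Score every passage against the query by cosine similarity."""
    # Normalize both sides so the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    return d @ q

# Toy corpus: each row is a (hypothetical) dense embedding of a passage.
doc_embeddings = np.array([
    [0.9, 0.1, 0.0],   # passage 0
    [0.1, 0.8, 0.3],   # passage 1
    [0.0, 0.2, 0.9],   # passage 2
])
# A (hypothetical) dense embedding of the user's query.
query_embedding = np.array([0.85, 0.15, 0.05])

scores = cosine_scores(query_embedding, doc_embeddings)
top_k = np.argsort(-scores)[:2]  # indices of the 2 most similar passages
```

Because scoring reduces to vector dot products, large corpora can be searched quickly with approximate nearest-neighbor indexes (e.g., FAISS) rather than brute-force comparison.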
Papers
What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary
Ori Ram, Liat Bezalel, Adi Zicher, Yonatan Belinkov, Jonathan Berant, Amir Globerson
Adam: Dense Retrieval Distillation with Adaptive Dark Examples
Chongyang Tao, Chang Liu, Tao Shen, Can Xu, Xiubo Geng, Binxing Jiao, Daxin Jiang
MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are Better Dense Retrievers
Kun Zhou, Xiao Liu, Yeyun Gong, Wayne Xin Zhao, Daxin Jiang, Nan Duan, Ji-Rong Wen
Retrieval-based Disentangled Representation Learning with Natural Language Supervision
Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Xin Jiang, Qun Liu, Lei Chen
Boosted Dense Retriever
Patrick Lewis, Barlas Oğuz, Wenhan Xiong, Fabio Petroni, Wen-tau Yih, Sebastian Riedel
Learning to Retrieve Passages without Supervision
Ori Ram, Gal Shachaf, Omer Levy, Jonathan Berant, Amir Globerson
GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
Kexin Wang, Nandan Thakur, Nils Reimers, Iryna Gurevych