Dense Retriever
Dense retrieval finds relevant information (e.g., documents, passages) in large collections by representing both queries and items as dense vectors, so that relevance ranking reduces to fast vector-similarity comparisons. Current research emphasizes improving the accuracy and efficiency of these methods, exploring techniques such as contrastive learning, knowledge distillation, and the integration of large language models (LLMs), particularly in low-resource or zero-shot scenarios. These advances matter for applications including question answering, conversational search, and biomedical literature search, where they enable faster and more accurate information access.
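The core retrieval step described above can be sketched in a few lines. The sketch below is illustrative only: the embeddings are random stand-ins, whereas in a real dense retriever they would come from a trained bi-encoder (the papers listed here study how to train exactly that encoder); the `retrieve` helper and its parameters are hypothetical names, not from any specific library.

```python
import numpy as np

# Stand-in embeddings: in practice these come from a trained bi-encoder,
# not random vectors. 5 passages, embedding dimension 8.
rng = np.random.default_rng(0)
passage_vecs = rng.normal(size=(5, 8))
# A query whose embedding happens to lie near passage 2.
query_vec = passage_vecs[2] + 0.01 * rng.normal(size=8)

def retrieve(query, corpus, k=3):
    """Return indices of the top-k passages by cosine similarity."""
    corpus_n = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = corpus_n @ query_n          # one dot product per passage
    return np.argsort(-scores)[:k]       # highest-scoring passages first

top = retrieve(query_vec, passage_vecs)
```

Because similarity is a single matrix-vector product, retrieval over millions of passages stays fast, typically with an approximate nearest-neighbor index rather than the exhaustive scan shown here.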
Papers
MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are Better Dense Retrievers
Kun Zhou, Xiao Liu, Yeyun Gong, Wayne Xin Zhao, Daxin Jiang, Nan Duan, Ji-Rong Wen
Retrieval-based Disentangled Representation Learning with Natural Language Supervision
Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Xin Jiang, Qun Liu, Lei Chen
Boosted Dense Retriever
Patrick Lewis, Barlas Oğuz, Wenhan Xiong, Fabio Petroni, Wen-tau Yih, Sebastian Riedel
Learning to Retrieve Passages without Supervision
Ori Ram, Gal Shachaf, Omer Levy, Jonathan Berant, Amir Globerson
GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
Kexin Wang, Nandan Thakur, Nils Reimers, Iryna Gurevych