Paper ID: 2305.14087

BM25 Query Augmentation Learned End-to-End

Xiaoyin Chen, Sam Wiseman

Given BM25's enduring competitiveness as an information retrieval baseline, we investigate how much further it can be improved by augmenting and re-weighting its sparse query-vector representation. We propose an approach that learns both the augmentation and the re-weighting end-to-end, and we find that it improves performance over BM25 while retaining its speed. We furthermore find that the learned augmentations and re-weightings transfer well to unseen datasets.
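To make the idea of an augmented, re-weighted sparse query vector concrete, the following is a minimal sketch rather than the authors' implementation. It treats BM25 scoring as a dot product between a sparse query vector and a precomputed sparse document vector, so plain BM25 corresponds to a query vector with unit weights on the query terms. The toy corpus, the function names `bm25_doc_vectors` and `score`, the parameter values `k1=1.2` and `b=0.75`, and the hand-picked weights in `augmented_query` are all assumptions for illustration; in the paper these weights would be learned, not set by hand.

```python
import math
from collections import Counter


def bm25_doc_vectors(docs, k1=1.2, b=0.75):
    """Precompute sparse BM25 document vectors: term -> idf * saturated tf."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    vectors = []
    for d in docs:
        tf = Counter(d)
        vec = {}
        for t, f in tf.items():
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = f + k1 * (1 - b + b * len(d) / avgdl)
            vec[t] = idf * f * (k1 + 1) / norm
        vectors.append(vec)
    return vectors


def score(query_weights, doc_vec):
    """Sparse dot product between a weighted query and a document vector."""
    return sum(w * doc_vec.get(t, 0.0) for t, w in query_weights.items())


# Toy corpus (hypothetical), already tokenized.
docs = [
    "the cat sat on the mat".split(),
    "a feline rested on a rug".split(),
    "stock markets fell sharply today".split(),
]
doc_vecs = bm25_doc_vectors(docs)

# Plain BM25: unit weight on each original query term.
plain_query = {"cat": 1.0, "mat": 1.0}

# Augmented and re-weighted query: original terms get new weights and
# extra terms are added. These weights are hand-picked for illustration;
# they stand in for what a learned model would output.
augmented_query = {"cat": 1.5, "mat": 0.5, "feline": 0.8, "rug": 0.3}

plain_scores = [score(plain_query, v) for v in doc_vecs]
augmented_scores = [score(augmented_query, v) for v in doc_vecs]
```

In this sketch the second document shares no terms with the original query and scores zero under plain BM25, while the augmented query gives it a positive score. Scoring remains a sparse dot product over the same document vectors, which is consistent with the abstract's claim that the approach retains BM25's speed.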

Submitted: May 23, 2023