Paper ID: 2209.13738

mRobust04: A Multilingual Version of the TREC Robust 2004 Benchmark

Vitor Jeronymo, Mauricio Nascimento, Roberto Lotufo, Rodrigo Nogueira

Robust 2004 is an information retrieval benchmark whose large number of judgments per query makes it a reliable evaluation dataset. In this paper, we present mRobust04, a multilingual version of Robust04 that was translated into 8 languages using Google Translate. We also provide results of three different multilingual retrievers on this dataset. The dataset is available at https://huggingface.co/datasets/unicamp-dl/mrobust
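
To illustrate how per-query relevance judgments are used to score a retriever, the following is a minimal sketch, not the evaluation code used in the paper. The metric (nDCG with linear gain), the cutoff k, the treatment of unjudged documents as non-relevant, and all query and document identifiers are assumptions made for illustration only.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranked_doc_ids, judgments, k=20):
    """nDCG@k for one query.

    ranked_doc_ids: document ids in the order returned by a retriever.
    judgments: dict mapping document id -> graded relevance (0 = not relevant).
    Unjudged documents are treated as non-relevant (an assumption of this sketch).
    """
    gains = [judgments.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal_gains = sorted(judgments.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal_gains)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judgments and retriever output; not data from mRobust04.
qrels = {"q1": {"d1": 2, "d2": 1, "d3": 0}}
run = {"q1": ["d2", "d1", "d4"]}

scores = {qid: ndcg_at_k(run[qid], qrels[qid]) for qid in run}
mean_ndcg = sum(scores.values()) / len(scores)
print(f"mean nDCG@20 = {mean_ndcg:.4f}")
```

With more judged documents per query, fewer retrieved documents fall into the unjudged case, so the score depends less on that assumption.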

Submitted: Sep 27, 2022