Paper ID: 2308.03103

Improving Domain-Specific Retrieval by NLI Fine-Tuning

Roman Dušek, Aleksander Wawer, Christopher Galias, Lidia Wojciechowska

This article investigates the potential of fine-tuning on natural language inference (NLI) data to improve information retrieval and ranking. We demonstrate this for both English and Polish, using data from one of the largest Polish e-commerce sites and selected open-domain datasets. We employ both monolingual and multilingual sentence encoders, fine-tuned with a supervised method that uses a contrastive loss and NLI data. Our results show that NLI fine-tuning improves model performance in both tasks and both languages, and can benefit both mono- and multilingual models. Finally, we investigate the uniformity and alignment of the embeddings to explain the effect of NLI-based fine-tuning in an out-of-domain use case.
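
The abstract names uniformity and alignment of the embeddings without defining them. The sketch below is one plausible reading, assuming the commonly used definitions from Wang and Isola (2020): alignment is the mean distance between embeddings of positive pairs, and uniformity is the log of the mean Gaussian potential over all pairs of embeddings. The function names, the defaults `alpha=2` and `t=2`, and the toy vectors are illustrative assumptions, not taken from the paper.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit length so distances are measured on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(pairs, alpha=2):
    """Mean of distance**alpha over positive pairs; lower means better aligned."""
    total = 0.0
    for a, b in pairs:
        a, b = l2_normalize(a), l2_normalize(b)
        total += math.sqrt(sq_dist(a, b)) ** alpha
    return total / len(pairs)


def uniformity(embeddings, t=2):
    """Log of the mean Gaussian potential over all distinct pairs; lower means more uniform."""
    normed = [l2_normalize(e) for e in embeddings]
    potentials = [math.exp(-t * sq_dist(a, b)) for a, b in combinations(normed, 2)]
    return math.log(sum(potentials) / len(potentials))


# Toy example (hypothetical 2-d embeddings): a query/document positive pair
# that points in nearly the same direction, plus an unrelated embedding.
positive_pairs = [([1.0, 0.1], [0.9, 0.2])]
all_embeddings = [[1.0, 0.1], [0.9, 0.2], [-1.0, 0.3]]
print("alignment:", alignment(positive_pairs))
print("uniformity:", uniformity(all_embeddings))
```

Under these definitions, lower values are better for both quantities: a low alignment score means that related texts sit close together, and a low uniformity score means that the embeddings spread over the sphere rather than collapsing into a small region.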

Submitted: Aug 6, 2023

Topics

Domain Specific
Natural Language Inference
Multilingual Model
Multilingual Sentence Encoders
