Paper ID: 2406.12494

LightPAL: Lightweight Passage Retrieval for Open Domain Multi-Document Summarization

Masafumi Enomoto, Kunihiro Takeoka, Kosuke Akimoto, Kiril Gashteovski, Masafumi Oyamada

Open-Domain Multi-Document Summarization (ODMDS) is the task of generating summaries from large document collections in response to user queries. This task is crucial for efficiently addressing diverse information needs from users. Traditional retrieve-then-summarize approaches fall short for open-ended queries in ODMDS tasks. These queries often require broader context than initially retrieved passages provide, making it challenging to retrieve all relevant information in a single search. While iterative retrieval methods has been explored for multi-hop question answering (MQA), it's impractical for ODMDS due to high latency from repeated LLM inference. Accordingly, we propose LightPAL, a lightweight passage retrieval method for ODMDS. LightPAL leverages an LLM to pre-construct a graph representing passage relationships, then employs random walk during retrieval, avoiding iterative LLM inference. Experiments demonstrate that LightPAL outperforms naive sparse and pre-trained dense retrievers in both retrieval and summarization metrics, while achieving higher efficiency compared to iterative MQA approaches.

Submitted: Jun 18, 2024