Paper ID: 2303.14222

Lay Text Summarisation Using Natural Language Processing: A Narrative Literature Review

Oliver Vinzelberg, Mark David Jenkins, Gordon Morison, David McMinn, Zoe Tieges

Summarisation of research results in plain language is crucial for promoting public understanding of research findings. The use of Natural Language Processing to generate lay summaries has the potential to relieve researchers' workload and bridge the gap between science and society. The aim of this narrative literature review is to describe and compare the different text summarisation approaches used to generate lay summaries. We searched the databases Web of Science, Google Scholar, IEEE Xplore, Association for Computing Machinery Digital Library and arXiv for articles published until 6 May 2022. We included original studies on automatic text summarisation methods to generate lay summaries. We screened 82 articles and included eight relevant papers published between 2020 and 2021, all using the same dataset. The results show that transformer-based methods such as Bidirectional Encoder Representations from Transformers (BERT) and Pre-training with Extracted Gap-sentences for Abstractive Summarization (PEGASUS) dominate the landscape of lay text summarisation, with all but one study using these methods. A combination of extractive and abstractive summarisation methods in a hybrid approach was found to be most effective. Furthermore, pre-processing approaches to input text (e.g. applying extractive summarisation) or determining which sections of a text to include, appear critical. Evaluation metrics such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE) were used, which do not consider readability. To conclude, automatic lay text summarisation is under-explored. Future research should consider long document lay text summarisation, including clinical trial reports, and the development of evaluation metrics that consider readability of the lay summary.

Submitted: Mar 24, 2023