Paper ID: 2409.01087

Pre-Trained Language Models for Keyphrase Prediction: A Review

Muhammad Umair, Tangina Sultana, Young-Koo Lee

Keyphrase Prediction (KP) is essential for identifying keyphrases in a document that can summarize its content. However, recent Natural Language Processing (NLP) advances have developed more efficient KP models using deep learning techniques. The limitation of a comprehensive exploration jointly both keyphrase extraction and generation using pre-trained language models spotlights a critical gap in the literature, compelling our survey paper to bridge this deficiency and offer a unified and in-depth analysis to address limitations in previous surveys. This paper extensively examines the topic of pre-trained language models for keyphrase prediction (PLM-KP), which are trained on large text corpora via different learning (supervisor, unsupervised, semi-supervised, and self-supervised) techniques, to provide respective insights into these two types of tasks in NLP, precisely, Keyphrase Extraction (KPE) and Keyphrase Generation (KPG). We introduce appropriate taxonomies for PLM-KPE and KPG to highlight these two main tasks of NLP. Moreover, we point out some promising future directions for predicting keyphrases.

Submitted: Sep 2, 2024