Paper ID: 2402.13623

FLAME: Self-Supervised Low-Resource Taxonomy Expansion using Large Language Models

Sahil Mishra, Ujjwal Sudev, Tanmoy Chakraborty

Taxonomies are tree-structured (arborescent) hierarchies that relate entities to one another to convey knowledge within a specific domain. Each edge in a taxonomy represents a hypernym-hyponym relationship. Taxonomies are used in many real-world applications, such as e-commerce search engines and recommendation systems, so they need to be expanded over time. However, manually curating taxonomies with new data is difficult because human resources are limited and data grows exponentially, which makes automatic taxonomy expansion methods necessary. Traditional supervised taxonomy expansion approaches struggle in this setting because existing taxonomies are small, and the resulting scarcity of training data often leads to overfitting. In this paper, we propose FLAME, a novel approach for taxonomy expansion in low-resource environments that harnesses large language models (LLMs) trained on extensive real-world knowledge. The LLMs help compensate for the scarcity of domain-specific knowledge. Specifically, FLAME uses few-shot prompting to extract the knowledge inherent in the LLMs and identify the hypernym of a given entity within the taxonomy. It further fine-tunes the LLMs with reinforcement learning, which yields more accurate predictions. Experiments on three real-world benchmark datasets demonstrate the effectiveness of FLAME, which improves accuracy by 18.5% and the Wu & Palmer metric by 12.3% over eight baselines. We also examine the strengths and weaknesses of FLAME through an extensive case study, error analysis, and ablation studies on the benchmarks.
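
For readers unfamiliar with the Wu & Palmer metric mentioned above, the following is a minimal sketch of how the similarity between two taxonomy nodes is commonly computed: twice the depth of their lowest common ancestor, divided by the sum of their depths. The toy taxonomy, the node names, the helper functions, and the convention that the root has depth 1 are all assumptions made for illustration; the abstract does not specify how FLAME's evaluation implements the metric.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy stored as child -> parent; the root maps to None.
PARENT: Dict[str, Optional[str]] = {
    "food": None,
    "fruit": "food",
    "vegetable": "food",
    "apple": "fruit",
    "banana": "fruit",
    "carrot": "vegetable",
}


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the nodes from `node` up to the root, inclusive."""
    path: List[str] = []
    cur: Optional[str] = node
    while cur is not None:
        path.append(cur)
        cur = parent[cur]
    return path


def depth(node: str, parent: Dict[str, Optional[str]]) -> int:
    """Depth of a node, assuming the root has depth 1."""
    return len(path_to_root(node, parent))


def wu_palmer(a: str, b: str, parent: Dict[str, Optional[str]]) -> float:
    """Wu & Palmer similarity: 2 * depth(LCA) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b, parent))
    # Walking upward from `a`, the first node that is also an ancestor of `b`
    # (or `b` itself) is the lowest common ancestor.
    lca = next(n for n in path_to_root(a, parent) if n in ancestors_b)
    return 2 * depth(lca, parent) / (depth(a, parent) + depth(b, parent))


if __name__ == "__main__":
    # A predicted parent that is a sibling of the true parent scores lower
    # than an exact match, but is not counted as a complete miss.
    print(wu_palmer("fruit", "fruit", PARENT))
    print(wu_palmer("fruit", "vegetable", PARENT))
```

In a taxonomy expansion evaluation, a score of this kind is typically computed between the predicted hypernym and the ground-truth hypernym, so that near misses in the hierarchy receive partial credit where plain accuracy would give none.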

Submitted: Feb 21, 2024