Paper ID: 2402.11518
Large Language Model-driven Meta-structure Discovery in Heterogeneous Information Network
Lin Chen, Fengli Xu, Nian Li, Zhenyu Han, Meng Wang, Yong Li, Pan Hui
Heterogeneous information networks (HIN) have gained increasing popularity in recent years for capturing complex relations between diverse types of nodes. Meta-structures are proposed as a useful tool to identify the important patterns in HINs, but hand-crafted meta-structures pose significant challenges for scaling up, drawing wide research attention towards developing automatic search algorithms. Previous efforts primarily focused on searching for meta-structures with good empirical performance, overlooking the importance of human comprehensibility and generalizability. To address this challenge, we draw inspiration from the emergent reasoning abilities of large language models (LLMs). We propose ReStruct, a meta-structure search framework that integrates LLM reasoning into the evolutionary procedure. ReStruct uses a grammar translator to encode the meta-structures into natural language sentences, and leverages the reasoning power of LLMs to evaluate their semantic feasibility. Besides, ReStruct also employs performance-oriented evolutionary operations. These two competing forces allow ReStruct to jointly optimize the semantic explainability and empirical performance of meta-structures. Furthermore, ReStruct contains a differential LLM explainer to generate and refine natural language explanations for the discovered meta-structures by reasoning through the search history. Experiments on eight representative HIN datasets demonstrate that ReStruct achieves state-of-the-art performance in both recommendation and node classification tasks. Moreover, a survey study involving 73 graduate students shows that the discovered meta-structures and generated explanations by ReStruct are substantially more comprehensible. Our code and questionnaire are available at https://github.com/LinChen-65/ReStruct.
Submitted: Feb 18, 2024