Paper ID: 2412.00608

Leveraging LLM for Automated Ontology Extraction and Knowledge Graph Generation

Mohammad Sadeq Abolhasani, Rong Pan

Extracting relevant and structured knowledge from large, complex technical documents within the Reliability and Maintainability (RAM) domain is labor-intensive and prone to errors. Our work addresses this challenge by presenting OntoKGen, a genuine pipeline for ontology extraction and Knowledge Graph (KG) generation. OntoKGen leverages Large Language Models (LLMs) through an interactive user interface guided by our adaptive iterative Chain of Thought (CoT) algorithm to ensure that the ontology extraction process and, thus, KG generation align with user-specific requirements. Although KG generation follows a clear, structured path based on the confirmed ontology, there is no universally correct ontology as it is inherently based on the user's preferences. OntoKGen recommends an ontology grounded in best practices, minimizing user effort and providing valuable insights that may have been overlooked, all while giving the user complete control over the final ontology. Having generated the KG based on the confirmed ontology, OntoKGen enables seamless integration into schemeless, non-relational databases like Neo4j. This integration allows for flexible storage and retrieval of knowledge from diverse, unstructured sources, facilitating advanced querying, analysis, and decision-making. Moreover, the generated KG serves as a robust foundation for future integration into Retrieval Augmented Generation (RAG) systems, offering enhanced capabilities for developing domain-specific intelligent applications.

Submitted: Nov 30, 2024