Paper ID: 2206.02238

OntoMerger: An Ontology Integration Library for Deduplicating and Connecting Knowledge Graph Nodes

David Geleta, Andriy Nikolov, Mark ODonoghue, Benedek Rozemberczki, Anna Gogleva, Valentina Tamma, Terry R. Payne

Duplication of nodes is a common problem encountered when building knowledge graphs (KGs) from heterogeneous datasets, where it is crucial to be able to merge nodes having the same meaning. OntoMerger is a Python ontology integration library whose functionality is to deduplicate KG nodes. Our approach takes a set of KG nodes, mappings and disconnected hierarchies and generates a set of merged nodes together with a connected hierarchy. In addition, the library provides analytic and data testing functionalities that can be used to fine-tune the inputs, further reducing duplication, and to increase connectivity of the output graph. OntoMerger can be applied to a wide variety of ontologies and KGs. In this paper we introduce OntoMerger and illustrate its functionality on a real-world biomedical KG.

Submitted: Jun 5, 2022