Paper ID: 2112.09573

cgSpan: Closed Graph-Based Substructure Pattern Mining

Zevin Shaul, Sheikh Naaz

gSpan is a popular algorithm for mining frequent subgraphs. cgSpan (closed graph-based substructure pattern mining) is a gSpan extension that mines only closed subgraphs. A subgraph g is closed in the graph database if no proper frequent supergraph of g has equivalent occurrence with g. cgSpan adds the Early Termination pruning method to the gSpan pruning methods, while leaving the original gSpan steps unchanged. cgSpan also detects and handles cases in which Early Termination should not be applied. To the best of our knowledge, cgSpan is the first publicly available implementation for closed graph mining.
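
The closedness condition above can be illustrated with a small brute-force sketch. This is not cgSpan: it uses no DFS codes and no Early Termination. It rests on simplifying assumptions that are ours, not the paper's: vertex labels are unique within each graph, so a pattern can be represented by its edge set, the subgraph test reduces to a subset test, and equal support between a pattern and a supergraph stands in for equivalent occurrence. Connectivity of patterns is not enforced. The names support, mine_frequent, and closed_patterns are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[str, str]      # edge between two uniquely labeled vertices (assumption)
Pattern = FrozenSet[Edge]   # a (sub)graph, simplified to its edge set


def support(pattern: Pattern, database: List[Pattern]) -> int:
    """Count the database graphs that contain every edge of the pattern."""
    return sum(1 for graph in database if pattern <= graph)


def mine_frequent(database: List[Pattern], min_support: int) -> Dict[Pattern, int]:
    """Brute-force enumeration of frequent edge sets (exponential, toy use only)."""
    all_edges = sorted(set().union(*database))
    frequent: Dict[Pattern, int] = {}
    for size in range(1, len(all_edges) + 1):
        for combo in combinations(all_edges, size):
            pattern = frozenset(combo)
            count = support(pattern, database)
            if count >= min_support:
                frequent[pattern] = count
    return frequent


def closed_patterns(frequent: Dict[Pattern, int]) -> List[Pattern]:
    """Keep patterns that have no proper frequent supergraph with the same support."""
    closed = []
    for g, sup_g in frequent.items():
        absorbed = any(g < h and sup_g == sup_h for h, sup_h in frequent.items())
        if not absorbed:
            closed.append(g)
    return closed


# Toy database: two graphs with the path a-b-c-d, one with the path a-b-c.
AB, BC, CD = ("a", "b"), ("b", "c"), ("c", "d")
db = [frozenset({AB, BC, CD}), frozenset({AB, BC, CD}), frozenset({AB, BC})]

frequent = mine_frequent(db, min_support=2)
closed = closed_patterns(frequent)
```

In this toy database seven edge sets are frequent at minimum support 2, but only two are closed: {a-b, b-c} with support 3 and {a-b, b-c, c-d} with support 2. Every other frequent pattern is absorbed by a supergraph with the same support, which is the redundancy that mining only closed subgraphs removes.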

Submitted: Dec 17, 2021