Improving the dual cardinality simulation algorithms
Bernaola Ibarra, Luis Anggelo
MetadataShow full item record
Graph pattern matching is typically defined in terms of subgraph isomorphism, which makes it an NP-complete/NP-hard problem. Isomorphism algorithms requires bijective functions which can be too restrictive to identify patterns in real-world applications. Moreover, real-world graphs may contain some noise and the problem of finding the exact match can be very expensive. In order to avoid the combinatorial worst-case time complexity of subgraph isomorphism, we extend prior work on dual cardinality simulation. According to our experiments, this type of graph simulation offers high precision with good performance in large graphs. Precision is acquired because dual cardinality simulation checks the constraint in which the number of matching children or parents with the same label in the data graph should not be less than their correspondents in the query graphs. For improving the performance, we have introduced the concept of count sets which are computed before dual cardinality simulation is executed. Experiments are done on large graphs using synthetic and real-world graphs.