Finding Itemset-Sharing Patterns in a Large Itemset-Associated Graph
Authors: Mutsumi Fukuzaki, Mio Seki, Hisashi Kashima, Jun Sese
Abstract:
Itemset mining and graph mining have attracted considerable attention in the field of data mining, since they have many important applications in various areas such as biology, marketing, and social network analysis. However, most existing studies focus only on either itemset mining or graph mining, and only a few studies have addressed a combination of both. In this paper, we introduce a new problem which we call itemset-sharing subgraph (ISS) set enumeration, where the task is to find sets of subgraphs with common itemsets in a large graph in which each vertex has an associated itemset. The problem has various interesting potential applications such as in side-effect analysis in drug discovery and the analysis of the influence of word-of-mouth communication in marketing in social networks. We propose an efficient algorithm ROBIN for finding ISS sets in such graph; this algorithm enumerates connected subgraphs having common itemsets and finds their combinations. Experiments using a synthetic network verify that our method can efficiently process networks with more than one million edges. Experiments using a real biological network show that our algorithm can find biologically interesting patterns. We also apply ROBIN to a citation network and find successful collaborative research works.
If you think of a set of properties (an “itemset”) as a topic, and an “itemset-sharing subgraph (ISS)” as a matching/merging criterion, the relevance of this paper to topic maps becomes immediately obvious.
The approach is useful both for discovering topics in data sets and as part of processing a topic map.