首页 | 本学科首页   官方微博 | 高级检索  
     


Learning block-preserving graph patterns and its application to data mining
Authors:Hitoshi Yamasaki  Yosuke Sasaki  Takayoshi Shoudai  Tomoyuki Uchida  Yusuke Suzuki
Affiliation:(1) Department of Informatics, Kyushu University, Fukuoka 819-0395, Japan;(2) Department of Intelligent Systems, Hiroshima City University, Hiroshima 731-3194, Japan
Abstract:Recently, due to the rapid growth of electronic data having graph structures such as HTML and XML texts and chemical compounds, many researchers have been interested in data mining and machine learning techniques for finding useful patterns from graph-structured data (graph data). Since graph data contain a huge number of substructures and it tends to be computationally expensive to decide whether or not such data have given structural features, graph mining problems face computational difficulties. Let ${\mathcal{C}}$ be a graph class which satisfies a connected hereditary property and contains infinitely many different biconnected graphs, and for which a special kind of the graph isomorphism problem can be computed in polynomial time. In this paper, we consider learning and mining problems for  ${\mathcal{C}}$ . Firstly, we define a new graph pattern, which is called a block preserving graph pattern (bp-graph pattern) for  ${\mathcal{C}}$ . Secondly, we present a polynomial time algorithm for deciding whether or not a given bp-graph pattern matches a given graph in  ${\mathcal{C}}$ . Thirdly, by giving refinement operators over bp-graph patterns, we present a polynomial time algorithm for finding a minimally generalized bp-graph pattern for  ${\mathcal{C}}$ . Outerplanar graphs are planar graphs which can be embedded in the plane in such a way that all of vertices lie on the outer boundary. Many pharmacologic chemical compounds are known to be represented by outerplanar graphs. The class of connected outerplanar graphs ${\mathcal{O}}$ satisfies the above conditions for  ${\mathcal{C}}$ . Next, we propose two incremental polynomial time algorithms for enumerating all frequent bp-graph patterns with respect to a given finite set of graphs in  ${\mathcal{O}}$ . Finally, by reporting experimental results obtained by applying the two graph mining algorithms to a subset of the NCI dataset, we evaluate the performance of the two graph mining algorithms.
Keywords:Pattern discovery  Graph mining  Graph-structured pattern  Inductive inference  Outerplanar graph
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号