From sequential pattern mining to structured pattern mining: A pattern-growth approach
Authors: Jia-Wei Han, Jian Pei, Xi-Feng Yan
Affiliation: (1) University of Illinois at Urbana-Champaign, 61801 Urbana, IL, U.S.A.; (2) State University of New York at Buffalo, 14260-2000 Buffalo, NY, U.S.A.
Abstract: Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem, since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential pattern mining method, and (ii) SPADE, a vertical format-based method; and (2) a pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for mining structured patterns. In this study, we give a systematic introduction to the pattern-growth methodology and study its principles and extensions. We first introduce two pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential pattern mining. Then we introduce gSpan for mining structured patterns using the same methodology. Their relative performance on large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including mining multi-level, multi-dimensional patterns and mining constraint-based patterns.
Keywords: data mining; sequential pattern mining; structured pattern mining; scalability; performance analysis
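
To make the pattern-growth idea named in the abstract concrete, here is a minimal sketch in the spirit of PrefixSpan. It is not the authors' implementation. It assumes a simplified setting in which every sequence element is a single item rather than an itemset, and the names `prefixspan`, `grow`, and `min_support` are illustrative choices, not taken from the paper. Instead of generating candidate sequences and testing them against the whole database, each frequent prefix is extended using only its projected database, that is, the suffixes that follow the first occurrence of the prefix.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern (tuple of items): support} for all frequent patterns.

    Simplifying assumption: each sequence is a list of single items,
    not a list of itemsets as in the general problem setting.
    """
    results = {}

    def grow(prefix, projected):
        # `projected` holds one (sequence index, start offset) pair for every
        # sequence that contains `prefix`; the suffix from `start` onward is
        # that sequence's entry in the projected database.
        first_pos = defaultdict(dict)  # item -> {sequence index: first position}
        for sid, start in projected:
            seq = sequences[sid]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if sid not in first_pos[item]:
                    first_pos[item][sid] = pos

        for item, occurrences in first_pos.items():
            support = len(occurrences)  # number of sequences containing prefix + item
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Recurse on the database projected by the longer prefix.
            grow(pattern, [(sid, pos + 1) for sid, pos in occurrences.items()])

    grow((), [(sid, 0) for sid in range(len(sequences))])
    return results
```

On a toy database of four sequences, `[a b c]`, `[a c b]`, `[a b]`, and `[b c]`, with `min_support=2`, this sketch reports the single items plus the patterns `(a, b)`, `(a, c)`, and `(b, c)`. The sketch stores offsets into the original sequences rather than copying suffixes, which keeps each projection step cheap.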
Indexed in: CNKI, VIP, Wanfang Data, SpringerLink, and other databases.
Source: Journal of Computer Science and Technology (《计算机科学技术学报》)