首页 | 本学科首页   官方微博 | 高级检索  
     


Fast discovery of sequential patterns in large databases using effective time-indexing
Authors:Ming-Yen Lin  Sue-Chen Hsueh
Affiliation:a Department of Information Engineering and Computer Science, Feng Chia University, No. 100 Wenhwa Road, Seatwen, Taichung 40724, Taiwan
b Department of Information Management, Chaoyang University of Technology, Taiwan
Abstract:Sequential pattern mining algorithms can often produce more accurate results if they work with specific constraints in addition to the support threshold. Many systems implement time-independent constraints by selecting qualified patterns. This selection cannot implement time-dependent constraints, because the support computation process must validate the time attributes of every data sequence during mining. Therefore, we propose a memory time-indexing approach, called METISP, to discover sequential patterns with time constraints including minimum-gap, maximum-gap, exact-gap, sliding window, and duration constraints. METISP scans the database into memory and constructs time-index sets for effective processing. METISP uses index sets and a pattern-growth strategy to mine patterns without generating any candidates or sub-databases. The index sets narrow down the search space to the sets of designated in-memory data sequences, and speed up the counting of potential items within the indicated ranges. Our comprehensive experiments show that METISP has better efficiency, even with low support and large databases, than the well-known GSP and DELISP algorithms. METISP scales up linearly with respect to database size.
Keywords:Sequential patterns   Time constraint   Time-index   Sequence mining   Pattern-growth
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号