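A sliding-window-based sequential pattern mining algorithm for data streams
Cite this article:XIE Huosheng, HE Xingxing. A sliding-window-based sequential pattern mining algorithm for data streams[J]. Computer Engineering and Applications, 2012, 48(4): 121-124.
Authors:XIE Huosheng  HE Xingxing
Affiliation:College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108, China
Abstract:Sequential pattern discovery is one of the most important data mining tasks and has broad application prospects. Sequential pattern mining has been studied in depth for static databases, but research on mining sequential patterns from data streams is far less developed. Because a data stream is unbounded, it is usually impossible to store all of its data, and in many cases only the sequential patterns of the most recent time period are of interest. This paper proposes FPM-SW, an effective sequential pattern mining algorithm that incorporates the sliding window technique. The algorithm uses three data structures (PatternTable, CountTable, and Ta-tree) to handle the complexity of mining sequential patterns from data streams. It uses the CountTable structure to retain past potentially frequent sequences; because CountTable can occupy too much memory in some cases, the algorithm also incorporates a CountTable compression technique to reduce memory usage. The advantage of FPM-SW is that it minimizes the generation of false positives, and experiments show that FPM-SW achieves high accuracy.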

Keywords:sequential patterns  data stream mining  sliding window

Efficient algorithm for mining frequent sequential pattern based on sliding window in data stream
XIE Huosheng, HE Xingxing. Efficient algorithm for mining frequent sequential pattern based on sliding window in data stream[J]. Computer Engineering and Applications, 2012, 48(4): 121-124.
Authors:XIE Huosheng  HE Xingxing
Affiliation:College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108, China
Abstract:Sequential pattern mining is one of the most important data mining tasks and has broad applications. It has been studied extensively for static databases, but sequential pattern mining over data streams has received much less attention. Because a data stream is unbounded, not all of its data can be stored, and users are usually interested only in the sequential patterns of the most recent time period. This paper therefore introduces FPM-SW (Frequent Pattern Mining-Sliding Window), an effective algorithm that combines the sliding window technique with sequential pattern mining over data streams. It uses three data structures (PatternTable, CountTable, and Ta-tree) to handle the complexity of mining frequent sequential patterns in data streams. FPM-SW uses the CountTable structure to preserve past potentially frequent sequences; because the CountTable can use too much memory in some cases, the algorithm also incorporates a CountTable compression technique to reduce the memory footprint. The advantage of the algorithm is that it minimizes the number of false positives. Experimental results show that FPM-SW achieves high accuracy.
Keywords:sequential patterns  data stream mining  sliding window
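
The abstract does not spell out how FPM-SW works internally, so the following is only a minimal sketch of the general idea it names: counting sequential patterns over the most recent sequences of a stream and discarding the contribution of sequences that slide out of the window. It is not the authors' algorithm. The class `SlidingWindowMiner`, the `counts` table, and the parameters `window_size`, `min_count`, and `max_len` are all hypothetical names chosen for illustration, and the sketch does not model PatternTable, Ta-tree, or CountTable compression.

```python
from collections import Counter, deque
from itertools import combinations


def subsequences(seq, max_len):
    """Return the distinct order-preserving subsequences of seq with 1..max_len items."""
    found = set()
    for r in range(1, max_len + 1):
        # combinations() keeps the original item order, so each tuple is a subsequence.
        found.update(combinations(seq, r))
    return found


class SlidingWindowMiner:
    """Hypothetical illustration only; not the FPM-SW data structures from the paper."""

    def __init__(self, window_size, min_count, max_len=2):
        self.window_size = window_size  # number of most recent sequences to keep
        self.min_count = min_count      # absolute support threshold inside the window
        self.max_len = max_len          # longest pattern that is tracked
        self.window = deque()           # pattern sets of the sequences in the window
        self.counts = Counter()         # pattern -> number of window sequences containing it

    def add(self, seq):
        # Evict the oldest sequence first so the window never exceeds window_size.
        if len(self.window) == self.window_size:
            for pattern in self.window.popleft():
                self.counts[pattern] -= 1
                if self.counts[pattern] == 0:
                    del self.counts[pattern]
        patterns = subsequences(seq, self.max_len)
        self.window.append(patterns)
        self.counts.update(patterns)    # each pattern counts once per sequence

    def frequent(self):
        """Patterns contained in at least min_count sequences of the current window."""
        return {p: c for p, c in self.counts.items() if c >= self.min_count}


miner = SlidingWindowMiner(window_size=3, min_count=2)
for s in [("a", "b", "c"), ("a", "c"), ("b", "c"), ("a", "b")]:
    miner.add(s)
print(sorted(miner.frequent().items()))
# [(('a',), 2), (('b',), 2), (('c',), 2)]
```

In this run the first sequence has already left the window by the time the fourth arrives, so the two-item patterns it supported drop below the threshold. Enumerating every subsequence is exponential in `max_len`, which hints at why a real stream miner needs more compact structures than this sketch uses.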
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.