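A Matrix-Based Algorithm for Mining Top-k Frequent Itemsets over Data Streams
Cite this article:YIN Shao-hong (尹绍宏), FAN Gui-dan (范桂丹). A Matrix-Based Algorithm for Mining Top-k Frequent Itemsets over Data Streams[J]. Computer Engineering (计算机工程), 2014, 0(3): 55-58,75
Authors:YIN Shao-hong (尹绍宏)  FAN Gui-dan (范桂丹)
Affiliation:School of Computer Science and Software Engineering, Tianjin Polytechnic University (天津工业大学), Tianjin 300387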
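Abstract:Traditional data mining algorithms generate large numbers of redundant itemsets when mining frequent itemsets, which hurts mining efficiency. To address this, a matrix-based algorithm for mining Top-k frequent itemsets over data streams is proposed. Two 0-1 matrices are introduced: a transaction matrix and a 2-itemset matrix. The transaction matrix represents the transaction list in the sliding-window model, and the 2-itemset matrix is obtained by computing the support of each row. Candidate itemsets are derived from the 2-itemset matrix, and the support of each candidate is computed by applying a logical AND to the corresponding rows of the transaction matrix, which yields the Top-k frequent itemsets. The mining results are stored in a data dictionary so that, when a user queries, the Top-k frequent itemsets can be output in descending order of support. Experimental results show that the algorithm avoids generating redundant itemsets during mining and achieves high time efficiency while maintaining accuracy.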

Keywords:data mining  data stream  sliding window  matrix  Top-k frequent itemset

Top-k Frequent Itemsets Mining Algorithm over Data Streams Based on Matrix
YIN Shao-hong,FAN Gui-dan. Top-k Frequent Itemsets Mining Algorithm over Data Streams Based on Matrix[J]. Computer Engineering, 2014, 0(3): 55-58,75
Authors:YIN Shao-hong  FAN Gui-dan
Affiliation:(School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin 300387, China)
Abstract:Traditional data mining algorithms produce large numbers of redundant itemsets when mining frequent itemsets, which reduces mining efficiency. To address this, a matrix-based algorithm for mining Top-k frequent itemsets over data streams is proposed. The algorithm introduces two 0-1 matrices: a transaction matrix and a 2-itemset matrix. The transaction matrix represents the transaction list of a sliding window, and the 2-itemset matrix is obtained by calculating the support of each row. Candidate itemsets are then derived from the 2-itemset matrix, and their supports are calculated by applying a logical AND to the corresponding rows of the transaction matrix, which yields the Top-k frequent itemsets. Finally, the mining results are saved in a data dictionary, so that the algorithm can output the Top-k frequent itemsets in descending order of support when a user queries. Experimental results show that the algorithm avoids generating redundant itemsets during mining and achieves high time efficiency while maintaining accuracy.
Keywords:data mining  data stream  sliding window  matrix  Top-k frequent itemset
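
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the algorithm's details, so the function name `top_k_itemsets`, the `min_support` cutoff used to fill the 2-itemset matrix, the inclusion of single items in the result, and the tie-breaking order are all assumptions made for illustration. Each row of the transaction matrix is packed into an integer so that the logical AND of rows is a single `&`.

```python
from itertools import combinations


def top_k_itemsets(window, k, min_support=2):
    """Return the k itemsets with the highest support in `window`.

    `window` is a list of transactions (sets of items) in the current
    sliding window. `min_support` is an assumed cutoff for the 2-itemset
    matrix; it is not a value taken from the paper.
    """
    items = sorted({item for txn in window for item in txn})

    # Transaction matrix: one 0-1 row per item, one column per transaction.
    # Bit `col` of rows[item] is 1 when transaction `col` contains the item.
    rows = {
        item: sum(1 << col for col, txn in enumerate(window) if item in txn)
        for item in items
    }

    def support(itemset):
        # AND the rows of all items, then count the remaining 1 bits.
        bits = rows[itemset[0]]
        for item in itemset[1:]:
            bits &= rows[item]
        return bin(bits).count("1")

    # 2-itemset matrix, stored sparsely as the set of pairs whose entry is 1.
    frequent_pairs = {
        pair for pair in combinations(items, 2) if support(pair) >= min_support
    }

    # "Data dictionary" of results: itemset -> support.
    results = {}
    for item in items:
        count = support((item,))
        if count >= min_support:
            results[(item,)] = count

    # Grow candidates level by level. An item may extend an itemset only if
    # it forms a frequent pair with every member, so candidates that the
    # 2-itemset matrix already rules out are never generated.
    level = sorted(frequent_pairs)
    while level:
        next_level = []
        for itemset in level:
            results[itemset] = support(itemset)
            for item in items:
                if item <= itemset[-1]:
                    continue
                if all((member, item) in frequent_pairs for member in itemset):
                    candidate = itemset + (item,)
                    if support(candidate) >= min_support:
                        next_level.append(candidate)
        level = next_level

    # Descending support; ties broken by itemset size, then lexicographically.
    ranked = sorted(results.items(), key=lambda kv: (-kv[1], len(kv[0]), kv[0]))
    return ranked[:k]


if __name__ == "__main__":
    window = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in top_k_itemsets(window, k=4):
        print(itemset, count)
```

In a streaming setting the window could be kept in a `collections.deque` with `maxlen` set to the window size, so that appending a new transaction drops the oldest one; the sketch simply rebuilds the matrices from whatever window it is given.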
This article is indexed in CNKI, VIP (维普), and other databases.