首页 | 本学科首页   官方微博 | 高级检索  
     

改进的多数据流协同频繁项集挖掘算法
引用本文:王鑫,刘方爱.改进的多数据流协同频繁项集挖掘算法[J].计算机应用,2016,36(7):1988-1992.
作者姓名:王鑫  刘方爱
作者单位:1. 山东师范大学 信息科学与工程学院, 济南 250014;2. 山东省分布式计算机软件新技术重点实验室(山东师范大学), 济南 250014
基金项目:国家自然科学基金资助项目(61572301,90612003);山东省自然科学基金资助项目(ZR2013FM008)。
摘    要:针对已有的多数据流协同频繁项集挖掘算法存在内存占用率高以及发现频繁项集效率低的问题,提出了改进的多数据流协同频繁项集挖掘(MCMD-Stream)算法。首先,该算法利用单遍扫描数据库的字节序列滑动窗口挖掘算法发现数据流中的潜在频繁项集和频繁项集;其次,构建类似频繁模式树(FP-Tree)的压缩频繁模式树(CP-Tree)存储已发现的潜在频繁项集和频繁项集,同时更新CP-Tree树中每个节点生成的对数倾斜时间表中的频繁项计数;最后,通过汇总分析得出在多条数据流中多次出现的且有价值的频繁项集,即协同频繁项集。相比A-Stream和H-Stream算法,MCMD-Stream算法不仅能够提高多数据流中协同频繁项集挖掘的效率,并且还降低了内存空间的使用率。实验结果表明MCMD-Stream算法能够有效地应用于多数据流的协同频繁项集挖掘。

关 键 词:流数据挖掘  多数据流  滑动窗口  频繁项集  协同频繁项集  
收稿时间:2015-12-02
修稿时间:2016-03-01

Improved algorithm for mining collaborative frequent itemsets in multiple data streams
WANG Xin,LIU Fang'ai.Improved algorithm for mining collaborative frequent itemsets in multiple data streams[J].journal of Computer Applications,2016,36(7):1988-1992.
Authors:WANG Xin  LIU Fang'ai
Affiliation:1. College of Information Science and Engineering, Shandong Normal University, Jinan Shandong 250014, China;2. Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology (Shandong Normal University), Jinan Shandong 250014, China
Abstract:In view of low memory usage rate and inefficient discovery for mining frequent itemsets in multiple data streams, an improved Mining Collaborative frequent itemsets in Multiple Data Stream (MCMD-Stream) algorithm was proposed. Firstly, the window sliding based on bit-sequence technique was utilized, which was a single-pass algorithm to find the potential and frequent itemsets. Secondly, Compressed frequent Pattern Tree (CP-Tree), which is similar to Frequent Pattern Tree (FP-Tree), was constructed to store the potential and frequent itemsets. And each node in the CP-Tree could generate the logarithmic tilted window to save the counts of frequent itemsets. Finally, the valuable frequent itemsets that appeared repeatedly in multiple data streams, namely collaborative frequent itemsets, were got. Compared to A-Stream and H-Stream algorithms, MCMD-Stream algorithm can improve the mining efficiency of collaborative frequent itemsets in multiple data streams, and also reduce the usage of the memory space. The experimental results show that MCMD-Stream algorithm can efficiently be applied to mine the collaborative frequent itemsets in multiple data streams.
Keywords:stream data mining                                                                                                                        multiple data stream                                                                                                                        sliding window                                                                                                                        frequent itemset                                                                                                                        collaborative frequent itemset
点击此处可从《计算机应用》浏览原始摘要信息
点击此处可从《计算机应用》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号