首页 | 本学科首页   官方微博 | 高级检索  
     

挖掘数据流近似频繁项的改进算法
引用本文:王秀坤,王铁存,周国能,冯维. 挖掘数据流近似频繁项的改进算法[J]. 计算机工程与应用, 2008, 44(13): 150-152. DOI: 10.3778/j.issn.1002-8331.2008.13.045
作者姓名:王秀坤  王铁存  周国能  冯维
作者单位:大连理工大学,计算机系,辽宁,大连,116023;大连理工大学,计算机系,辽宁,大连,116023;大连理工大学,计算机系,辽宁,大连,116023;大连理工大学,计算机系,辽宁,大连,116023
摘    要:数据流的无限性、连续性和速度快等特点,使得挖掘出所有准确的数据流频繁项通常是不可能的.算法的空间复杂度和时间复杂度通常是评价频繁项挖掘算法优劣的两个主要度量.通过引入局部性原理改进数据流近似频繁项的挖掘算法,该算法的空间复杂性为O(1/ε),数据流每个数据项的最坏处理时间是O(1/ε),其最好处理时间是O(1),输出结果的频率值误差为∑_(i=2)^j(1-μi)×ki

关 键 词:数据流  数据流挖掘  频繁项
文章编号:1002-8331(2008)13-0150-03
收稿时间:2007-08-21
修稿时间:2007-08-21

Improved algorithm for mining approximate frequent item over data streams
WANG Xiu-kun,WANG Tie-cun,ZHOU Guo-neng,FENG Wei. Improved algorithm for mining approximate frequent item over data streams[J]. Computer Engineering and Applications, 2008, 44(13): 150-152. DOI: 10.3778/j.issn.1002-8331.2008.13.045
Authors:WANG Xiu-kun  WANG Tie-cun  ZHOU Guo-neng  FENG Wei
Affiliation:Department of Computer,Dalian University of Technology,Dalian,Liaoning 116023,China
Abstract:Because of the rapid data arriving speed and huge size of data set in stream model,it is usually unable to find all the accurate frequent items of a data stream.The space complexity and the time complexity are the main measurement which is used to evaluate the strongpoints and weaknesses of algorithm.This paper proposes an improved algorithm based on principle of locality to find ε-approximate frequent items of a data stream,its space complexity is O(1/ε).The processing time for each item is O(1/ε) in the worst and the processing time for each item is O(1) in the best.Moreover,the frequency error bound of the results returned by the proposed algorithm is ∑_(i=2)^j(1-μi)×ki.
Keywords:data stream  data stream mining  frequent item
本文献已被 CNKI 万方数据 等数据库收录!
点击此处可从《计算机工程与应用》浏览原始摘要信息
点击此处可从《计算机工程与应用》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号