挖掘数据流近似频繁项的改进算法 Improved algorithm for mining approximate frequent item over data streams期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

挖掘数据流近似频繁项的改进算法

引用本文：	王秀坤,王铁存,周国能,冯维.挖掘数据流近似频繁项的改进算法[J].计算机工程与应用,2008,44(13):150-152.

作者姓名：	王秀坤王铁存周国能冯维

作者单位：	大连理工大学计算机系，辽宁大连 116023

摘要：	数据流的无限性、连续性和速度快等特点，使得挖掘出所有准确的数据流频繁项通常是不可能的.算法的空间复杂度和时间复杂度通常是评价频繁项挖掘算法优劣的两个主要度量.通过引入局部性原理改进数据流近似频繁项的挖掘算法，该算法的空间复杂性为O（1/ε），数据流每个数据项的最坏处理时间是O（1/ε），其最好处理时间是O（1），输出结果的频率值误差为∑_(i=2)^j（1-μ_i）×k_i。
关键词：	数据流数据流挖掘频繁项
文章编号：	1002-8331(2008)13-0150-03
收稿时间：	2007-8-21
修稿时间：	2007年8月21日
Improved algorithm for mining approximate frequent item over data streams

WANG Xiu-kun,WANG Tie-cun,ZHOU Guo-neng,FENG Wei.Improved algorithm for mining approximate frequent item over data streams[J].Computer Engineering and Applications,2008,44(13):150-152.

Authors:	WANG Xiu-kun WANG Tie-cun ZHOU Guo-neng FENG Wei

Affiliation:	Department of Computer，Dalian University of Technology，Dalian，Liaoning 116023，China

Abstract:	Because of the rapid data arriving speed and huge size of data set in stream model，it is usually unable to find all the accurate frequent items of a data stream.The space complexity and the time complexity are the main measurement which is used to evaluate the strongpoints and weaknesses of algorithm.This paper proposes an improved algorithm based on principle of locality to find ε-approximate frequent items of a data stream，its space complexity is O（1/ε）.The processing time for each item is O（1/ε） in the worst and the processing time for each item is O（1） in the best.Moreover，the frequency error bound of the results returned by the proposed algorithm is ∑_(i=2)^j（1-μ_i）×k_i.

Keywords:	data stream data stream mining frequent item
本文献已被 CNKI 万方数据等数据库收录！
	点击此处可从《计算机工程与应用》浏览原始摘要信息
	点击此处可从《计算机工程与应用》下载全文

设为首页 | 免责声明 | 关于勤云 | 加入收藏