Mining frequent items on stream data

Cite this article:	TU Li, CHEN Ling. Mining frequent items on stream data[J]. Journal of Computer Applications, 2011, 31(2): 450-453. DOI: 10.3724/SP.J.1087.2011.00450

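Authors:	TU Li, CHEN Ling

Affiliations:	1. Jiangyin Polytechnic Institute; 2. College of Information Engineering, Yangzhou University; State Key Laboratory for Novel Software Technology, Nanjing University

Funding:	Supported by the National Natural Science Foundation of China, the Natural Science Foundation of Jiangsu Province, the Natural Science Foundation of the Jiangsu Provincial Department of Education, and the Jiangsu Province Graduate Research and Innovation Program for Regular Universities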

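Abstract:	An algorithm for mining frequent items on stream data, SW-COUNT, is proposed. The algorithm uses data sampling to mine the frequent items of a data stream over a sliding window. Given an error bound ε, SW-COUNT detects the frequent items of the data stream with error within εn using O(ε⁻¹) space, and the average processing time per data item is O(1). Extensive experiments show that the algorithm achieves better accuracy and better time and space efficiency than other similar algorithms.
Keywords:	data stream; frequent item; sliding window; sampling technique; data mining
Received:	2010-07-19
Revised:	2010-09-13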
Mining frequent items on stream data

TU Li, CHEN Ling. Mining frequent items on stream data[J]. Journal of Computer Applications, 2011, 31(2): 450-453. DOI: 10.3724/SP.J.1087.2011.00450

Authors:	TU Li, CHEN Ling

Affiliation:	1. Department of Computer Science, Jiangyin Polytechnic Institute, Jiangyin, Jiangsu 214405, China; 2. Department of Computer Science, Yangzhou University, Yangzhou, Jiangsu 225009, China; 3. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu 210093, China

Abstract:	A frequent-item mining algorithm for stream data, SW-COUNT, is proposed. It uses a data sampling technique to mine the frequent items of a data stream over a sliding window. Given an error threshold ε, SW-COUNT detects the ε-approximate frequent items of a data stream using O(ε⁻¹) memory, with an average processing time of O(1) per data item. Extensive experiments show that SW-COUNT outperforms other similar methods in accuracy and in time and space efficiency.
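
To make the stated bounds concrete, the sketch below shows the classic Misra-Gries summary, a well-known technique that also keeps O(ε⁻¹) counters and underestimates each count by at most εn. It is not SW-COUNT: the abstract does not describe the algorithm's sampling scheme or its sliding-window handling, so this example covers a whole stream of length n and uses no sampling. The function name `misra_gries` and its parameters are illustrative assumptions.

```python
import math


def misra_gries(stream, epsilon):
    """Approximate item counts with at most ceil(1/epsilon) counters.

    Illustrative stand-in only (NOT the SW-COUNT algorithm): no sliding
    window and no sampling. For a stream of n items, every reported
    count is at most the true count and falls short by at most epsilon * n.
    """
    capacity = math.ceil(1 / epsilon)  # O(1/epsilon) counters
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < capacity:
            counters[item] = 1
        else:
            # Summary is full: decrement every counter and drop the ones
            # that reach zero. The arriving item is discarded as well.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

The decrement step touches every counter, but each decrement is paid for by an earlier increment, so the cost per item is O(1) amortized. Any item whose true count exceeds εn is guaranteed to remain in the returned summary.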

Keywords:	data stream; frequent item; sliding window; sampling technique; data mining
This article is indexed in CNKI, Wanfang Data, and other databases.