
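Data Stream Outlier Detection Algorithm Based on k-Means Partitioning

Cite this article:	Ni Weiwei, Lu Jieping, Chen Geng, Sun Zhihui. Data Stream Outlier Detection Algorithm Based on k-Means Partitioning [J]. Journal of Computer Research and Development (计算机研究与发展), 2006, 43(9): 1639-1643.

Authors:	Ni Weiwei, Lu Jieping, Chen Geng, Sun Zhihui

Affiliation:	College of Computer Science and Engineering, Southeast University, Nanjing 210096

Funding:	National Natural Science Foundation of China; Specialized Research Fund for the Doctoral Program of Higher Education; Natural Science Foundation of Jiangsu Province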

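Abstract:	Outlier knowledge discovery is an important aspect of data mining research. Outlier mining in data streams is a particularly difficult part of it, because the data being mined are dynamic, cannot be re-read, and are large in volume. This paper proposes a data stream outlier detection algorithm based on k-means partitioning. The stream is first divided into partitions, and k-means clustering is applied to each partition to produce intermediate clustering results (a set of mean reference points). Potential outliers are then identified among these mean reference points according to the definition of an outlier. Theoretical analysis and experimental results show that the algorithm solves the data stream outlier detection problem effectively and is feasible.
Keywords:	data mining; outlier detection; mean reference point; aggregation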
Received:	2005-11-13
Revised:	2006-04-25
An Efficient Data Stream Outliers Detection Algorithm Based on k-Means Partitioning

Ni Weiwei, Lu Jieping, Chen Geng, Sun Zhihui. An Efficient Data Stream Outliers Detection Algorithm Based on k-Means Partitioning [J]. Journal of Computer Research and Development, 2006, 43(9): 1639-1643.

Authors:	Ni Weiwei, Lu Jieping, Chen Geng, Sun Zhihui

Affiliation:	College of Computer Science and Engineering, Southeast University, Nanjing 210096

Abstract:	Outlier detection is an important issue in data mining. Finding outliers in data streams is difficult because data streams are dynamic, can be read in only one pass, and are large in volume. In this paper, a data stream outlier detection algorithm based on k-means partitioning, DSOKP, is proposed. It applies k-means clustering to each partition of the data stream to generate a set of mean reference points, and subsequently picks out the potential outliers of each period according to the definition of outliers. Theoretical analysis and experimental results indicate that DSOKP is effective and efficient.

Keywords:	data mining; outlier detection; mean reference point; clustering
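
The pipeline summarized in the abstract (partition the stream, run k-means on each partition to obtain mean reference points, then look for potential outliers among those points) can be illustrated with a minimal Python sketch. This is not the published DSOKP algorithm. The abstract does not state the outlier definition, so the sketch assumes a simple distance-based one: a reference point is flagged when the total weight of reference points within `radius` of it (itself included) is below `min_weight`. The farthest-first seeding, all function names, and the parameters `size`, `k`, `radius`, and `min_weight` are likewise assumptions made for illustration.

```python
import math
from itertools import islice


def partitions(stream, size):
    """Cut a one-pass stream into consecutive chunks of at most `size` points."""
    it = iter(stream)
    while chunk := list(islice(it, size)):
        yield chunk


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the first
    point, then repeatedly add the point farthest from the seeds chosen so far."""
    seeds = [points[0]]
    while len(seeds) < min(k, len(points)):
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    return seeds


def assign(points, centroids):
    """Group points by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def mean_reference_points(points, k, iters=10):
    """Lloyd's k-means on one partition; returns (centroid, member_count) pairs."""
    centroids = farthest_first(points, k)
    for _ in range(iters):
        clusters = assign(points, centroids)
        # Move each centroid to the mean of its members; keep it if it has none.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    clusters = assign(points, centroids)
    return [(c, len(m)) for c, m in zip(centroids, clusters) if m]


def potential_outliers(refs, radius, min_weight):
    """Assumed definition: flag a reference point whose neighbourhood of the
    given radius carries less than `min_weight` points in total."""
    flagged = []
    for centre, weight in refs:
        support = sum(w for c, w in refs if math.dist(centre, c) <= radius)
        if support < min_weight:
            flagged.append((centre, weight))
    return flagged


def detect(stream, size, k, radius, min_weight):
    """Yield the flagged reference points of each partition in stream order."""
    for chunk in partitions(stream, size):
        yield potential_outliers(mean_reference_points(chunk, k), radius, min_weight)


# Two dense groups, plus one far-away point in the first partition only.
group_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
group_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
stream = group_a + group_b + [(50.0, 50.0)] + group_a + group_b + [(5.0, 5.0)]

results = list(detect(stream, size=9, k=3, radius=8.0, min_weight=3))
print(results)
```

Because only the weighted reference points of each partition are kept, the raw points can be discarded once their partition has been clustered, which fits the single-pass constraint the abstract describes. How well the flagged points match true outliers depends on the chosen `k`, `radius`, and `min_weight`.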
This article is indexed in databases including CNKI, VIP, and Wanfang Data.
