
A Fast Outlier Detection Algorithm for Data Streams Based on Dynamic Grids
Citation: YANG Yi-Dong, SUN Zhi-Hui, ZHU Yu-Quan, YANG Ming, ZHANG Bo-Li. A Fast Outlier Detection Algorithm for Data Streams Based on Dynamic Grids[J]. Journal of Software, 2006, 17(8): 1796-1803
Authors: YANG Yi-Dong  SUN Zhi-Hui  ZHU Yu-Quan  YANG Ming  ZHANG Bo-Li
Affiliation: 1. Department of Computer Science and Engineering, Southeast University, Nanjing 210096, China; 2. School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang 212013, China; 3. Department of Computer Science, Nanjing Normal University, Nanjing 210000, China
Received: 2004-09-30
Revised: 2005-10-11

Abstract: Outlier detection is an important data mining task with applications in many fields, and mining algorithms for data streams have attracted growing attention in recent years. To detect outliers in data streams, this paper presents a fast algorithm based on dynamic grid partitioning of the data space. The algorithm uses dynamic grids to separate dense regions from sparse regions and filters out the bulk of the data, which lies in dense regions, greatly reducing the number of objects that must be examined. For candidate outliers in sparse regions, it computes outlierness with an approximate method and outputs the data with high outlierness as outliers. The algorithm's running efficiency improves substantially while a certain level of accuracy is maintained. Experiments on both synthetic and real data sets confirm that the algorithm is applicable and effective.

Keywords: data stream  outlier detection  time-sensitive dynamic grid partitioning
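
The abstract gives only the outline of the method, so the following Python sketch is an illustration under stated assumptions, not the authors' algorithm. It shows the general pattern of grid-based filtering followed by a distance-based score: points in dense cells are skipped, and each point in a sparse cell is scored by its mean distance to its k nearest neighbors. The class name `GridOutlierDetector`, every parameter (`cell_width`, `dense_threshold`, `k`, `score_threshold`, `window_size`), the fixed-width grid, the sliding window, and the k-nearest-neighbor score are all hypothetical choices made here; the paper's time-sensitive dynamic partitioning and its approximate outlierness measure are not specified in the abstract.

```python
from collections import Counter, deque
from math import dist, floor


class GridOutlierDetector:
    """Illustrative sketch only: fixed-width grid over a sliding window."""

    def __init__(self, cell_width=1.0, dense_threshold=3, k=2,
                 score_threshold=1.5, window_size=100):
        self.cell_width = cell_width            # side length of a grid cell (assumed)
        self.dense_threshold = dense_threshold  # cells with at least this many points are dense
        self.k = k                              # neighbors used for the outlierness score
        self.score_threshold = score_threshold  # scores above this are reported
        self.window_size = window_size          # number of recent points retained
        self.window = deque()
        self.cell_counts = Counter()

    def _cell(self, point):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(floor(x / self.cell_width) for x in point)

    def insert(self, point):
        # Add a new stream element and evict the oldest one if the window is full.
        point = tuple(point)
        self.window.append(point)
        self.cell_counts[self._cell(point)] += 1
        if len(self.window) > self.window_size:
            old_cell = self._cell(self.window.popleft())
            self.cell_counts[old_cell] -= 1
            if self.cell_counts[old_cell] == 0:
                del self.cell_counts[old_cell]

    def outliers(self):
        # Skip points in dense cells; score the remaining candidates.
        result = []
        for i, p in enumerate(self.window):
            if self.cell_counts[self._cell(p)] >= self.dense_threshold:
                continue
            nearest = sorted(
                dist(p, q) for j, q in enumerate(self.window) if j != i
            )[: self.k]
            if not nearest:
                continue
            score = sum(nearest) / len(nearest)
            if score > self.score_threshold:
                result.append(p)
        return result


detector = GridOutlierDetector()
for pt in [(0.1, 0.1), (0.2, 0.2), (0.3, 0.1), (0.2, 0.4), (1.1, 0.2), (5.5, 5.5)]:
    detector.insert(pt)
print(detector.outliers())  # [(5.5, 5.5)]
```

In this example the four clustered points share one dense cell and are never scored. The point (1.1, 0.2) sits alone in its cell but lies close to the cluster, so its score stays below the threshold; only (5.5, 5.5) is reported. The scoring step here still scans the whole window for each candidate, whereas the abstract indicates the paper uses an approximation to avoid that cost.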
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.