
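Outlier mining algorithm for high-dimensional large datasets based on grid techniques
Cite this article: CAO Hong-qi, SUN Zhi-hui. Outlier mining algorithm for high-dimensional large datasets based on grid techniques[J]. Journal of Computer Applications (计算机应用), 2007, 27(10): 2369-2371
Authors: CAO Hong-qi  SUN Zhi-hui
Affiliations: 1. Department of Electronic Engineering, Nantong Vocational College, Nantong, Jiangsu 226007
2. School of Computer Science and Engineering, Southeast University, Nanjing 210096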
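Abstract: An outlier mining algorithm for high-dimensional large datasets based on grid techniques (OMAGT) is proposed. Exploiting the distribution characteristics of high-dimensional large datasets, the algorithm first uses a grid-based method to locate clustering regions and deletes the cluster points inside those regions that cannot be outliers, then applies the Local Outlier Factor (LOF) algorithm to the remaining points. OMAGT releases clustering information dynamically and keeps the retained outlier mining information within a bounded amount of memory, which improves both the time efficiency and the space efficiency of the algorithm. Theoretical analysis and experimental results show that OMAGT is feasible and effective.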

Keywords: data mining  outliers  grid  clustering region
Article ID: 1001-9081(2007)10-2369-03
Received: 2007-04-23
Revised: 2007-04-23

Algorithm of outliers mining based on grid techniques in high dimension large dataset
CAO Hong-qi,SUN Zhi-hui. Algorithm of outliers mining based on grid techniques in high dimension large dataset[J]. Journal of Computer Applications, 2007, 27(10): 2369-2371
Authors:CAO Hong-qi  SUN Zhi-hui
Abstract:An outlier mining algorithm for high-dimensional large datasets, called Outliers Mining Algorithm based on Grid Techniques (OMAGT), was proposed. Exploiting the distribution characteristics of high-dimensional large datasets, the algorithm first found the clustering regions with a grid-based method and deleted the points inside those regions that could not be outliers. Outlier mining was then performed on the remaining points with the Local Outlier Factor (LOF) algorithm. OMAGT released clustering information dynamically, so the information retained for outlier mining stayed within a limited memory capacity, and both time efficiency and space efficiency were improved. Theoretical analysis and experimental results show that the algorithm is feasible and efficient.
Keywords:data mining  outliers  grid  clustering region
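
The two-stage idea described in the abstract (grid-based pruning of clustering regions, then LOF on what remains) can be illustrated with a minimal Python sketch. This is not the authors' OMAGT implementation: the abstract does not say how clustering regions are delimited or how clustering information is released from memory, so the rule used here (drop a cell only when it and all of its face-adjacent cells are dense) is an assumption, as are the names `prune_cluster_interiors`, `lof_scores`, `cell_width`, and `density_threshold`. The LOF part is a simplified brute-force version that takes exactly `k` neighbours and assumes no duplicate points.

```python
import math
from collections import defaultdict


def prune_cluster_interiors(points, cell_width, density_threshold):
    """Drop points lying in the interior of dense grid regions (assumed rule).

    A cell is 'dense' if it holds at least density_threshold points. A cell is
    'interior' if it is dense and every face-adjacent cell is dense as well.
    Points in interior cells are treated as cluster points and removed.
    """
    cells = defaultdict(list)
    for p in points:
        cells[tuple(math.floor(x / cell_width) for x in p)].append(p)

    def is_dense(key):
        return len(cells.get(key, ())) >= density_threshold

    def is_interior(key):
        if not is_dense(key):
            return False
        for axis in range(len(key)):
            for step in (-1, 1):
                neighbour = key[:axis] + (key[axis] + step,) + key[axis + 1:]
                if not is_dense(neighbour):
                    return False
        return True

    return [p for key, members in cells.items()
            if not is_interior(key) for p in members]


def lof_scores(points, k):
    """Brute-force LOF over the given points; requires len(points) > k.

    Simplifications: exactly k neighbours are used (ties are cut by index),
    and points are assumed distinct so that no reachability sum is zero.
    """
    neighbours, k_dist = [], []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j)
                       for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        neighbours.append(nearest)
        k_dist.append(nearest[-1][0])  # distance to the k-th nearest neighbour

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(len(points)):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean neighbour density relative to the point's own density.
    return [sum(lrd[j] for _, j in neighbours[i]) / (k * lrd[i])
            for i in range(len(points))]


# Hypothetical data: a 10 x 10 lattice forming one cluster, plus one far point.
cluster = [(i * 0.5 + 0.25, j * 0.5 + 0.25) for i in range(10) for j in range(10)]
data = cluster + [(10.0, 10.0)]

candidates = prune_cluster_interiors(data, cell_width=1.0, density_threshold=4)
scores = lof_scores(candidates, k=3)
suspect = max(zip(scores, candidates))[1]
print(len(data), "points reduced to", len(candidates), "candidates; top LOF at", suspect)
```

Keeping the boundary cells of a dense region is a deliberate choice in this sketch: if every cluster point were removed, the surviving sparse points would only have each other as neighbours and their LOF values would no longer reflect the density of the nearby cluster.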
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.