
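A Distance-Based Re-clustering Algorithm for Outlier Detection
Cite this article: XU Xue-song, LIU Feng-yu. A distance-based re-clustering algorithm for outlier detection[J]. Journal of Computer Applications (计算机应用), 2006, 26(10): 2398-2400
Authors: XU Xue-song, LIU Feng-yu
Affiliation: School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094 (both authors)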
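Abstract: By studying the identification, analysis, and evaluation procedures of the distance-based (Cell-Based) outlier detection algorithm, this paper points out its strengths and weaknesses and proposes a new outlier detection algorithm: a distance-based re-clustering outlier detection algorithm. Theoretical analysis and simulation results show that the algorithm effectively overcomes two drawbacks of the traditional distance-based algorithm, namely that its cell structure must be adjusted whenever the parameters change and that it is suitable only for low-dimensional data. It also avoids the problem of different outlier results being produced by randomly selected initial values, and it converges relatively quickly.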

Keywords: clustering; distance; outlier data
Article ID: 1001-9081(2006)10-2398-03
Received: 2006-04-24
Revised: 2006-06-23

Algorithm of finding Outlier for reclustering based on distance
XU Xue-song,LIU Feng-yu. Algorithm of finding Outlier for reclustering based on distance[J]. Journal of Computer Applications, 2006, 26(10): 2398-2400
Authors:XU Xue-song  LIU Feng-yu
Affiliation:Department of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing Jiangsu 210094, China
Abstract: The identification, analysis, and evaluation procedures of the distance-based (Cell-Based) outlier detection algorithm are first studied, and its advantages and disadvantages are pointed out. A new outlier detection algorithm, a distance-based re-clustering algorithm, is then proposed. Theoretical analysis and experimental results show that the algorithm effectively overcomes two faults of the traditional Cell-Based algorithm: it must be recomputed from scratch whenever the parameters change, and it is suitable only for low-dimensional data. The algorithm also avoids the problem of different outlier results being produced by randomly selected initial values, and it converges quickly.
Keywords: clustering; distance; outlier
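
For readers unfamiliar with the distance-based notion of an outlier that the abstract builds on, the following is a minimal brute-force sketch. It assumes the DB(p, D) definition commonly used in the distance-based outlier literature: a point is an outlier if at least a fraction p of the points in the data set lie farther than D from it. It is not the re-clustering algorithm proposed in the paper, whose steps the abstract does not give, and it is not the Cell-Based algorithm either. The function name `db_outliers`, the parameter names, and the sample data are hypothetical and chosen only for illustration.

```python
import math


def db_outliers(points, p, d):
    """Return indices of DB(p, d)-outliers using a brute-force O(n^2) scan.

    Assumed definition: point i is an outlier if at least a fraction p
    of all n points lie at a distance greater than d from it.
    """
    n = len(points)
    outliers = []
    for i, a in enumerate(points):
        # Count the points that lie farther than d from point i.
        # The point itself is at distance 0, so it is never counted.
        far = sum(1 for b in points if math.dist(a, b) > d)
        if far >= p * n:
            outliers.append(i)
    return outliers


# Hypothetical sample: four points in a tight group and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
result = db_outliers(points, p=0.75, d=1.0)
print(result)  # [4]
```

The quadratic scan compares every pair of points, which is the cost that cell-based and clustering-based approaches try to reduce. Note also that the result depends directly on the parameters p and D, which is the sensitivity the abstract refers to when it says the Cell-Based structure must be rebuilt after a parameter change.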
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.