
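A New Density-Based Clustering and Outlier Detection Algorithm
Cite this article: LIU Da-ren, SUN Huan-liang, NIU Zhi-cheng, ZHU Ye-li. A New Density-Based Clustering and Outlier Detection Algorithm [J]. Journal of Shenyang Archit Civil Eng Univ: Nat Sci, 2006, 22(1): 149-153.
Authors: LIU Da-ren, SUN Huan-liang, NIU Zhi-cheng, ZHU Ye-li
Affiliations: [1] Editorial Department of the Journal of Shenyang Jianzhu University, Shenyang, Liaoning 110168; [2] School of Science, Shenyang Jianzhu University, Shenyang, Liaoning 110168; [3] Computing Center, Shenyang Jianzhu University, Shenyang, Liaoning 110168; [4] School of Information and Control Engineering, Shenyang Jianzhu University, Shenyang, Liaoning 110168
Funding: Natural Science Foundation of Liaoning Province; project funded by the Education Department of Liaoning Province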
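Abstract: Objective: To propose a new cluster-analysis algorithm that performs clustering and detects outliers (isolated points) at the same time. Methods: The SNN algorithm and the LOF algorithm are combined into a new algorithm, SNN_LOF, which works as follows: (1) build the similarity matrix; (2) remove noise; (3) compute the density; (4) mark the core points; (5) compute the lrd value of each data point; (6) form clusters starting from the core objects; (7) pick out the data points treated as noise; (8) compute the LOF value of the data defined as noise and output the data points regarded as outliers. A program implementing the algorithm was written to carry out clustering and outlier detection. Results: On the CURE data set, the DBSCAN clustering algorithm and the SNN clustering algorithm give the same results with very similar running times. Once the number of data points rises above 10,000, however, SNN_LOF clusters noticeably more efficiently than DBSCAN, and it also detects the outliers. Conclusion: SNN_LOF can discover outliers while clustering. On large data sets, its clustering time efficiency is clearly higher than that of DBSCAN.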

Keywords: clustering; outlier; SNN algorithm; LOF detection algorithm
Article ID: 1671-2021(2006)01-0149-05
Revised: October 16, 2005

A New Density-based Clustering and Examination Algorithm on the Isolated Point
LIU Da-ren,SUN Huan-liang,NIU Zhi-cheng,ZHU Ye-li.A New Density-based Clustering and Examination Algorithm on the Isolated Point[J].Journal of Shenyang Archit Civil Eng Univ: Nat Sci,2006,22(1):149-153.
Authors:LIU Da-ren  SUN Huan-liang  NIU Zhi-cheng  ZHU Ye-li
Abstract: A new clustering algorithm is proposed that performs clustering and detects isolated points (outliers) at the same time. The new algorithm, SNN_LOF, combines the SNN algorithm with the LOF algorithm and proceeds in eight steps: (1) build the similarity matrix; (2) remove noise; (3) compute the density; (4) mark the core points; (5) compute the lrd value of each data point; (6) form clusters starting from the core objects; (7) pick out the data points treated as noise; (8) compute the LOF value of the points defined as noise and output those regarded as isolated points. The algorithm was implemented as a program that carries out both clustering and isolated-point detection. On the CURE data set, the DBSCAN clustering algorithm and the SNN algorithm produce the same results in almost the same time. When the number of data points exceeds 10,000, however, SNN_LOF clusters noticeably more efficiently than DBSCAN and also detects the isolated points. In conclusion, SNN_LOF can find isolated points while clustering, and on large data sets it takes clearly less time than DBSCAN.
Keywords:clustering  isolated point  SNN algorithm  LOF examination algorithm
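
The eight steps listed in the abstract can be sketched roughly as below. This is only a minimal NumPy illustration under stated assumptions, not the authors' program: the function names and the parameters `k`, `eps`, `min_pts` and `lof_threshold` (with their default values) are hypothetical, since the abstract gives none of them. Step (2) is interpreted here as keeping only links between mutual k-nearest neighbours, SNN similarity is taken as the number of shared nearest neighbours, and lrd and LOF follow the standard definitions using exactly k neighbours per point (distance ties are ignored).

```python
import numpy as np

def knn_lists(X, k):
    """Pairwise Euclidean distances and each point's k nearest neighbours."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbour
    idx = np.argsort(d, axis=1)[:, :k]
    return d, idx

def snn_similarity(idx):
    """Steps (1)-(2): shared-neighbour counts, kept only for mutual neighbours."""
    n = idx.shape[0]
    member = np.zeros((n, n), dtype=bool)
    member[np.arange(n)[:, None], idx] = True          # member[i, j]: j in kNN(i)
    shared = member.astype(int) @ member.astype(int).T  # |kNN(i) & kNN(j)|
    mutual = member & member.T                          # assumed noise filter
    return np.where(mutual, shared, 0)

def lrd_and_lof(d, idx):
    """Step (5) and the LOF part of step (8), standard definitions."""
    n = idx.shape[0]
    k_dist = d[np.arange(n), idx[:, -1]]                # distance to k-th neighbour
    # reach-dist(p, o) = max(k-distance(o), d(p, o)) for each neighbour o of p
    reach = np.maximum(d[np.arange(n)[:, None], idx], k_dist[idx])
    lrd = 1.0 / (reach.mean(axis=1) + 1e-12)            # guard against duplicates
    lof = lrd[idx].mean(axis=1) / lrd
    return lrd, lof

def snn_lof(X, k=10, eps=4, min_pts=5, lof_threshold=1.5):
    """Return (labels, outliers); label -1 marks points left as noise."""
    d, idx = knn_lists(X, k)
    sim = snn_similarity(idx)
    strong = sim >= eps                     # sufficiently similar pairs
    density = strong.sum(axis=1)            # step (3)
    core = density >= min_pts               # step (4)
    lrd, lof = lrd_and_lof(d, idx)          # step (5)

    labels = np.full(len(X), -1)
    cluster = 0
    for seed in np.flatnonzero(core):       # step (6): grow from core points
        if labels[seed] != -1:
            continue
        labels[seed] = cluster
        stack = [seed]
        while stack:
            p = stack.pop()
            for q in np.flatnonzero(strong[p]):
                if labels[q] == -1:
                    labels[q] = cluster
                    if core[q]:             # only core points keep expanding
                        stack.append(q)
        cluster += 1

    noise = np.flatnonzero(labels == -1)    # step (7)
    outliers = noise[lof[noise] > lof_threshold]   # step (8)
    return labels, outliers
```

Calling `snn_lof(X)` on an `(n, d)` array returns one cluster label per point plus the indices of the noise points whose LOF exceeds the threshold. The dense distance matrix keeps the sketch short but costs O(n^2) memory, so it says nothing about the running times reported in the abstract.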
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.