Outlier detection algorithm based on neighborhood system density difference measurement
Cite this article: Du Xusheng, Yu Jiong, Chen Jiaying, Wang Yuefei, Pu Yonglin, Ye Lele. Outlier detection algorithm based on neighborhood system density difference measurement[J]. Application Research of Computers, 2020, 37(7): 1969-1973.
Authors: Du Xusheng  Yu Jiong  Chen Jiaying  Wang Yuefei  Pu Yonglin  Ye Lele
Affiliations: School of Software, Xinjiang University, Urumqi 830008; School of Software, Xinjiang University, Urumqi 830008; School of Information Science and Engineering, Xinjiang University, Urumqi 830008; School of Information Science and Engineering, Xinjiang University, Urumqi 830008; School of Software, Xi'an Jiaotong University, Xi'an 710049
Abstract: The outlier detection algorithm LOF has low detection accuracy and high parameter sensitivity on high-dimensional, discretely distributed datasets. To address these problems, this paper proposes NSD (neighborhood system density difference), an outlier detection algorithm based on measuring the density difference between neighborhood systems. Unlike traditional density-based outlier detection methods, NSD introduces the concept of an intercept distance. First, it counts the neighbors of each object in the dataset that lie within the intercept distance. Second, it computes the neighborhood system density of each object. Third, it compares the density of an object with the densities of its neighbors to judge how strongly the object and its neighbors tend to belong to the same cluster. Finally, it outputs the objects most likely to be outliers. Experiments comparing NSD with the LOF, LDOF, and CBOF algorithms on real-world and synthetic datasets show that NSD achieves higher detection accuracy and execution efficiency and lower parameter sensitivity, indicating that the algorithm is effective and feasible.

Keywords: data mining  outlier detection  density-based  LOF (local outlier factor)  LDOF (local distance-based outlier factor)  CBOF (cohesiveness-based outlier factor)
Received: 2018/12/20
Revised: 2020/6/2
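
The four steps named in the abstract can be illustrated with a rough sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption made for illustration only: density is taken to be the plain neighbor count within the intercept distance, the score is taken to be the ratio of the neighbors' mean density to the object's own density, and the function names (`neighbors_within`, `density_difference_scores`, `top_outliers`) and the `cutoff` parameter are hypothetical. This is not the authors' NSD definition.

```python
import math


def neighbors_within(points, cutoff):
    """Step 1: for each point, collect the indices of the other points
    that lie within `cutoff` (standing in for the intercept distance)."""
    result = []
    for i, p in enumerate(points):
        near = [j for j, q in enumerate(points)
                if j != i and math.dist(p, q) <= cutoff]
        result.append(near)
    return result


def density_difference_scores(points, cutoff):
    """Steps 2 and 3 under the stated assumptions.

    Assumed density: the number of neighbors within `cutoff`.
    Assumed score: mean density of the neighbors divided by the point's
    own density, so a point that is sparser than its neighbors scores
    above 1. A point with no neighbors gets an infinite score.
    """
    neighbors = neighbors_within(points, cutoff)
    density = [len(near) for near in neighbors]
    scores = []
    for i, near in enumerate(neighbors):
        if not near:
            scores.append(math.inf)  # isolated point: nothing to compare with
            continue
        mean_neighbor_density = sum(density[j] for j in near) / len(near)
        scores.append(mean_neighbor_density / density[i])
    return scores


def top_outliers(points, cutoff, n):
    """Step 4: return the indices of the n highest-scoring points."""
    scores = density_difference_scores(points, cutoff)
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return order[:n]
```

For example, with five points packed around the unit square and a sixth at `(10, 10)`, calling `top_outliers(points, 1.5, 1)` picks out the distant point, because it has no neighbors within the cutoff. The pairwise loop costs O(n²) distance computations; a spatial index would be the usual replacement on larger datasets.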
This article is indexed in Wanfang Data and other databases.