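Outlier detection algorithm based on affinity propagation
Cite this article: Zhang Qianqian, Yu Jiong, Li Ziyang, Pu Yonglin. Outlier detection algorithm based on affinity propagation[J]. Application Research of Computers, 2021, 38(6): 1662-1667.
Authors: Zhang Qianqian  Yu Jiong  Li Ziyang  Pu Yonglin
Affiliations: School of Software, Xinjiang University, Urumqi 830091; School of Software, Xinjiang University, Urumqi 830091; School of Information Science and Engineering, Xinjiang University, Urumqi 830046; School of Information Science and Engineering, Xinjiang University, Urumqi 830046
Funding: National Natural Science Foundation of China (61862060, 61462079, 61562086, 61562078)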
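Abstract: Outliers are objects whose attributes differ from those of normal points. Outlier detection has important applications across industries, such as maintaining data purity and safeguarding security. Most existing algorithms detect outliers with traditional distance-based or density-based methods. The proposed algorithm assigns each object an "isolation degree", that is, how isolated the point is relative to its neighbors, and identifies outliers by ranking these values, which is more efficient than traditional algorithms. Building on and optimizing the AP (affinity propagation) clustering algorithm, the paper proposes APO (outlier detection algorithm based on affinity propagation), an algorithm that can detect anomalous data points. It adds an isolation-degree module that computes and processes the isolation information of sample points, and introduces an amplification factor that makes the difference between outliers and normal points more pronounced; increasing the algorithm's sensitivity to outliers improves its accuracy. Comparative experiments on simulated and real datasets show that the algorithm is more sensitive to outliers than the AP algorithm, and that it can cluster while detecting outliers, a capability that other detection algorithms lack.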

Keywords: outlier detection  clustering algorithm  data mining  affinity propagation
Received: 2020/8/25
Revised: 2021/5/9

Outlier detection algorithm based on affinity propagation
Zhang Qianqian,Yu Jiong,Li Ziyang and Pu Yonglin.Outlier detection algorithm based on affinity propagation[J].Application Research of Computers,2021,38(6):1662-1667.
Authors:Zhang Qianqian  Yu Jiong  Li Ziyang and Pu Yonglin
Affiliation:Software College, Xinjiang University, Urumqi, Xinjiang
Abstract:Outliers are objects whose properties differ from those of normal points. Outlier detection is widely applied across industries to maintain data purity and ensure safety. Most existing algorithms detect outliers with traditional distance-based or density-based methods. This paper assigns each object an "isolation degree", the degree to which the point is isolated relative to its adjacent points, and identifies outliers by sorting these values, which is more efficient. It proposes the detection algorithm APO by improving and optimizing the AP (affinity propagation) clustering algorithm. It introduces an isolation-degree module that processes the isolation information of points. In addition, it adds an amplification factor that enlarges the isolation of outliers, so that the difference between outliers and normal points is more obvious and the sensitivity and accuracy of the algorithm are higher. Experiments on simulated and real datasets show that the algorithm is more sensitive to outliers and detects them more accurately than the AP algorithm. In addition, the algorithm can cluster the data while detecting outliers, a capability that other detection algorithms do not offer.
Keywords:outlier detection  clustering algorithm  data mining  affinity propagation
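
The ranking idea in the abstract (score each point by how isolated it is from its neighbors, amplify the score, then sort) can be illustrated with a minimal sketch. This is not the paper's APO algorithm: the abstract does not give the isolation-degree formula or the form of the amplification factor, and the sketch does not perform affinity propagation message passing. It assumes, purely for illustration, that isolation is the mean distance to the k nearest neighbors and that amplification is a power applied to the median-normalized score. The names isolation_scores, top_outliers, k, and amplification are hypothetical.

```python
import numpy as np

def isolation_scores(X, k=3, amplification=2.0):
    """Illustrative isolation score (NOT the APO formula, which the abstract omits).

    Assumption: isolation = mean Euclidean distance to the k nearest neighbors,
    normalized by the median so typical points score near 1, then raised to
    the power `amplification` to widen the gap between outliers and the rest.
    """
    X = np.asarray(X, dtype=float)
    diffs = X[:, None, :] - X[None, :, :]        # pairwise coordinate differences
    dist = np.sqrt((diffs ** 2).sum(axis=2))     # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)               # a point is not its own neighbor
    knn = np.sort(dist, axis=1)[:, :k]           # k smallest distances per point
    isolation = knn.mean(axis=1)
    eps = 1e-12                                  # guard against a zero median
    ratio = isolation / (np.median(isolation) + eps)
    return ratio ** amplification

def top_outliers(X, n_outliers=1, k=3, amplification=2.0):
    """Rank points by descending score; return the top indices and all scores."""
    scores = isolation_scores(X, k=k, amplification=amplification)
    order = np.argsort(scores)[::-1]
    return order[:n_outliers], scores

# Five points near the origin plus one distant point at index 5.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [10, 10]]
idx, scores = top_outliers(X, n_outliers=1, k=2)
print(idx, np.round(scores, 2))
```

With amplification greater than 1, any point whose normalized isolation exceeds 1 is pushed further from the bulk of the data, which is one simple way to read the abstract's claim that amplification makes outliers easier to separate by sorting.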
This article is indexed in Wanfang Data and other databases.