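An Outlier Recognition Algorithm Based on Grid Adjacency Relation
Cite this article:LI Guang-xing, YANG Yan. An Outlier Recognition Algorithm Based on Grid Adjacency Relation[J]. Computer Engineering & Science, 2010, 32(9): 130-133.
Authors:LI Guang-xing  YANG Yan
Affiliation:1. Department of Fundamental Courses, Chengdu Vocational College of Agricultural Science and Technology, Chengdu, Sichuan 611130; School of Information Science and Technology, Southwest Jiaotong University, Chengdu, Sichuan 610031
2. School of Information Science and Technology, Southwest Jiaotong University, Chengdu, Sichuan 610031
Abstract:Outliers are data points that deviate from the rest of the observed objects. Because the density of the unit containing an outlier may be higher or lower than that of its adjacent units, this paper proposes GAO, an outlier recognition algorithm based on grid adjacency relations. The algorithm measures the degree of deviation between units by their relative density and the distance between their centroids, and determines outlier units and outliers from that degree. Experimental results show that the algorithm effectively identifies outliers in multi-density data sets, is more efficient than the Cell-based algorithm, and is suitable for outlier recognition on large data sets.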

Keywords:adjacent units  diversity function  outlier
Received:2010-03-13
Revised:2010-06-10

An Outlier Recognition Algorithm Based on Grid Adjacency Relation
LI Guang-xing,YANG Yan.An Outlier Recognition Algorithm Based on Grid Adjacency Relation[J].Computer Engineering & Science,2010,32(9):130-133.
Authors:LI Guang-xing  YANG Yan
Affiliation:(1.Department of Fundamental Courses,Chengdu Vocational College of Agricultural Science and Technology,Chengdu 611130;2.School of Information Science and Technology,Southwest Jiaotong University,Chengdu 610031,China)
Abstract:Outliers are data points that deviate from the rest of the observed objects. Because the density of the unit containing an outlier may be higher or lower than that of its neighboring units, this paper presents GAO, an outlier recognition algorithm based on the grid adjacency relation. The degree of deviation between units is measured by their relative density and the distance between their centers of mass, and outlier units and outliers are determined from that degree. Experimental results show that the algorithm effectively recognizes outliers in multi-density, high-dimensional, and large data sets, and that it is more efficient than Cell-based algorithms.
Keywords:adjacent units  diversity function  outlier
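
The abstract names the ingredients of GAO (grid units, relative density between adjacent units, distance between unit centroids) but does not give the formula that combines them. The Python sketch below is one possible reading, not the authors' method. The function names, the product form of `deviation`, the mean-over-neighbors score, and the `threshold` parameter are all assumptions introduced for illustration.

```python
from itertools import product
from math import dist, floor, inf


def build_grid(points, cell_size):
    """Group points by the grid unit (tuple of integer indices) that contains them."""
    cells = {}
    for p in points:
        idx = tuple(floor(x / cell_size) for x in p)
        cells.setdefault(idx, []).append(p)
    return cells


def centroid(pts):
    """Mean position of the points in one unit."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def deviation(pts_a, pts_b, cell_size):
    """Assumed deviation degree between two units (NOT the paper's formula):
    relative density difference times centroid distance in units of cell_size."""
    n_a, n_b = len(pts_a), len(pts_b)
    relative_density = abs(n_a - n_b) / max(n_a, n_b)
    centroid_gap = dist(centroid(pts_a), centroid(pts_b)) / cell_size
    return relative_density * centroid_gap


def cell_scores(cells, cell_size):
    """Mean deviation of each non-empty unit from its non-empty adjacent units.
    A unit with no non-empty neighbor gets an infinite score (a simplification)."""
    scores = {}
    for idx, pts in cells.items():
        devs = []
        for off in product((-1, 0, 1), repeat=len(idx)):
            nb = tuple(i + o for i, o in zip(idx, off))
            if any(off) and nb in cells:  # skip the unit itself and empty units
                devs.append(deviation(pts, cells[nb], cell_size))
        scores[idx] = sum(devs) / len(devs) if devs else inf
    return scores


def find_outliers(points, cell_size, threshold):
    """Return the points lying in units whose score exceeds the assumed threshold."""
    cells = build_grid(points, cell_size)
    scores = cell_scores(cells, cell_size)
    return [p for idx, pts in cells.items() if scores[idx] > threshold for p in pts]


# A dense 8 x 8 block of points spanning four units, plus one stray point.
cluster = [(i * 0.25, j * 0.25) for i in range(8) for j in range(8)]
points = cluster + [(2.9, 2.9)]
print(find_outliers(points, cell_size=1.0, threshold=1.0))  # [(2.9, 2.9)]
```

Because each unit is compared only with its adjacent units, the cost in this sketch grows with the number of non-empty units rather than with the number of point pairs. This matches the abstract's efficiency claim only in spirit; nothing here reproduces the reported experiments.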
This article is indexed in Wanfang Data and other databases.