
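Research on a spatial outlier mining algorithm based on holographic entropy
Cite this article: XUE An-rong, HE Feng, WEN Dan-dan. Research on a spatial outlier mining algorithm based on holographic entropy[J]. Application Research of Computers, 2014, 31(2): 369-372.
Authors: XUE An-rong, HE Feng, WEN Dan-dan
Affiliation: School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China
Funding: National Natural Science Foundation of China (61300228); Specialized Research Fund for the Doctoral Program of Higher Education (20093227110005)
Abstract: Distance-based and density-based outlier detection algorithms are challenged by scalability in both dimensionality and data volume. Moreover, because spatial data are autocorrelated and heterogeneous, information-theoretic outlier detection algorithms, which assume mutually independent attributes and work on categorical attributes, are also poorly suited to spatial outlier detection. A mixed-attribute spatial outlier detection algorithm based on holographic entropy is therefore proposed. The algorithm partitions the data into regions using a region-identifier attribute, determines spatial neighborhoods within each region from spatial relationships, and retrieves them with an R*-tree. On this basis, a holographic-entropy-based measure of spatial outlier degree and a spatial outlier mining algorithm are proposed, which address both measuring the outlier degree of mixed attributes and mining the outliers. Because region partitioning favors parallel computation, the approach can accommodate large data volumes. Theory and experiments show that the proposed algorithm has advantages in computational efficiency and in the interpretability of the experimental results.

Keywords: holographic entropy; R*-tree; spatial outlier; outlier detection; mixed attributes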

Spatial outlier detection based on holographic entropy
XUE An-rong, HE Feng, WEN Dan-dan. Spatial outlier detection based on holographic entropy[J]. Application Research of Computers, 2014, 31(2): 369-372.
Authors: XUE An-rong, HE Feng, WEN Dan-dan
Affiliation: School of Computer Science & Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China
Abstract: Distance-based and density-based outlier detection algorithms scale poorly with both dimensionality and data volume. Information-theoretic outlier detection algorithms, in turn, assume mutually independent attributes and categorical attributes, so they are ill-suited to spatial data, which is autocorrelated and heterogeneous. This paper therefore proposes a spatial outlier detection algorithm for mixed attributes based on holographic entropy. The algorithm partitions the data into regions using a region-identifier attribute, determines the spatial neighborhood within each region from spatial relationships, and retrieves it with an R*-tree. On this basis, the paper proposes a spatial outlier degree based on holographic entropy and a spatial outlier mining algorithm, which together address both measuring the outlier degree of mixed attributes and mining the outliers. Because region partitioning is conducive to parallel computing, the approach can handle large volumes of data. Theoretical analysis and experimental results show that the proposed algorithm has advantages in computational efficiency and in the interpretability of its results.
Keywords: holographic entropy; R*-tree; spatial outlier; outlier detection; mixed attributes
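
The pipeline outlined in the abstract (partition by region, find spatial neighbors inside each region, score each object with an entropy-based measure) can be illustrated with a minimal Python sketch. It is not the authors' algorithm, since the abstract gives no formulas. The sketch assumes that holographic entropy is taken as the sum of per-attribute Shannon entropies, that numeric attributes are discretized into bins beforehand, that a brute-force k-nearest-neighbor search stands in for the R*-tree, and that the outlier score is the change in holographic entropy when an object is added to its neighborhood. All function names, the bin edges, and the sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def attribute_entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def holo_entropy(rows):
    """Sum of per-attribute entropies over a set of rows (assumed definition)."""
    if not rows:
        return 0.0
    return sum(attribute_entropy(column) for column in zip(*rows))


def discretize(value, edges):
    """Bin index of a numeric value given sorted bin edges (an assumption
    made here so that numeric and categorical attributes can be mixed)."""
    return sum(value >= e for e in edges)


def k_nearest(i, points, members, k):
    """Brute-force k nearest neighbors of object i among the objects of the
    same region. A stand-in for the R*-tree lookup named in the abstract."""
    others = [j for j in members if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def spatial_outlier_scores(points, regions, attrs, k=3):
    """Score each object by how much it raises the holographic entropy of
    its own spatial neighborhood (hypothetical scoring rule)."""
    by_region = defaultdict(list)
    for i, region in enumerate(regions):
        by_region[region].append(i)

    scores = {}
    # Regions are independent of each other, so this loop could run in parallel.
    for members in by_region.values():
        for i in members:
            neighbor_rows = [attrs[j] for j in k_nearest(i, points, members, k)]
            with_obj = holo_entropy(neighbor_rows + [attrs[i]])
            scores[i] = with_obj - holo_entropy(neighbor_rows)
    return scores


# Hypothetical sample: (x, y) locations, a region label, and two non-spatial
# attributes per object (land use, elevation in meters).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11)]
regions = ["A"] * 5 + ["B"] * 4
raw = [("forest", 120), ("forest", 150), ("forest", 130), ("forest", 140),
       ("urban", 900),
       ("urban", 20), ("urban", 25), ("urban", 30), ("urban", 22)]
attrs = [(land, discretize(elev, [100, 500])) for land, elev in raw]

scores = spatial_outlier_scores(points, regions, attrs, k=3)
top = max(scores, key=scores.get)
print(top, round(scores[top], 4))
```

In this sample, the "urban" object sits in the middle of region A, where its neighbors are all "forest", so it receives the highest score, while the same attribute values in region B score zero. That is the point of comparing an object only against neighbors in its own region: the score reflects local context, not global rarity.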