
Entropy Distance-Based Outlier Detection and Its Application
Cite this article: SUN Ai-cheng. Entropy Distance-Based Outlier Detection and Its Application[J]. Radio Engineering of China, 2012, 42(6): 45-47, 51.
Author: SUN Ai-cheng
Affiliation: School of Information Engineering, Wuhan University of Technology, Wuhan, Hubei 430070, China
Abstract: Outlier detection identifies data that are inconsistent with normal data. For various reasons, student ratings in teaching evaluation contain some noisy data. Based on the characteristics of this noise, this paper proposes an entropy distance-based outlier detection algorithm. The algorithm clusters the data set with hierarchical k-means and then begins detection from the cluster most likely to contain outliers. During detection, an entropy-based distance measures each data point's degree of outlierness, and pruning rules reduce the number of checks, which improves detection efficiency. Simulation results show that the algorithm filters the noisy data in student ratings of teaching effectiveness well.

Keywords: outlier; clustering; pruning rules; entropy distance
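
The abstract describes a pipeline (cluster, visit the most suspicious cluster first, score points by an entropy-based distance) without giving formulas. The following is a minimal sketch of that flow under stated assumptions, not the paper's algorithm: plain k-means stands in for hierarchical k-means, the "entropy distance" is assumed to be a Euclidean distance whose attributes are weighted by the Shannon entropy of their binned values, small clusters are assumed to be the most suspicious, and the pruning rules are omitted because the abstract does not specify them. All function names and parameters (`entropy_weights`, `bins`, `min_size`, and so on) are hypothetical.

```python
import math
from collections import Counter


def entropy_weights(points, bins=4):
    """ASSUMPTION: weight each attribute by the Shannon entropy of its binned values."""
    dims, n = len(points[0]), len(points)
    entropies = []
    for d in range(dims):
        col = [p[d] for p in points]
        lo, hi = min(col), max(col)
        width = (hi - lo) / bins or 1.0  # constant column: fall back to width 1
        counts = Counter(min(int((v - lo) / width), bins - 1) for v in col)
        entropies.append(-sum(c / n * math.log2(c / n) for c in counts.values()))
    total = sum(entropies)
    if total == 0:
        return [1.0 / dims] * dims  # no attribute carries information: equal weights
    return [e / total for e in entropies]


def weighted_dist(p, q, w):
    """Entropy-weighted Euclidean distance (an assumed form of 'entropy distance')."""
    return math.sqrt(sum(wi * (a - b) ** 2 for wi, a, b in zip(w, p, q)))


def kmeans(points, k, w, iters=20):
    """Plain k-means with deterministic farthest-first initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(weighted_dist(p, c, w) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: weighted_dist(p, centroids[c], w))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(x) / len(members) for x in zip(*members)]
    return centroids, clusters


def detect_outliers(points, k=3, top_n=1, min_size=2):
    """Return the top_n (score, point) pairs, highest outlier score first."""
    w = entropy_weights(points)
    centroids, clusters = kmeans(points, k, w)
    # ASSUMPTION: clusters with at least min_size members represent normal data.
    large = [c for c, m in zip(centroids, clusters) if len(m) >= min_size] or centroids
    scored = []
    # Visit the smallest clusters first (assumed to be the most likely to hold outliers).
    for members in sorted(clusters, key=len):
        for p in members:
            scored.append((min(weighted_dist(p, c, w) for c in large), p))
    scored.sort(reverse=True)
    return scored[:top_n]


# Toy data: two tight groups plus one isolated point.
sample = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9), (20, 1)]
for score, point in detect_outliers(sample, k=3, top_n=1):
    print(point, round(score, 2))
```

On this toy data the isolated point `(20, 1)` ranks first. Because the sketch scores every point before sorting, the visiting order does not change the result here; it would matter only once pruning rules stop the scan early, which is the efficiency gain the abstract attributes to the original method.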
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.