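An outlier mining algorithm based on an information entropy measure
Cite this article: ZHANG He, CAI Jiang-hui, ZHANG Ji-fu, QIAO Kan. An outlier mining algorithm based on information entropy [J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2010, 5(2): 150-155.
Authors: ZHANG He  CAI Jiang-hui  ZHANG Ji-fu  QIAO Kan
Affiliations: 1. School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, Shanxi 030024, China
2. School of Automation Science and Electrical Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100191, China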
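Abstract: Outlier mining aims to find anomalous data patterns that are relatively sparse and isolated within massive data sets, but traditional outlier mining methods are strongly affected by human factors. By introducing an outlier measure factor based on information entropy, a new outlier mining algorithm is presented. The algorithm first uses information entropy to compute the outlier measure factor of each data object, then uses that factor to gauge each object's degree of outlierness and thereby detect outliers. This effectively removes the influence of subjective human factors on outlier detection and gives a clear interpretation of what an outlier means. Finally, experiments on UCI and stellar spectral data, together with an analysis of the results, verify the feasibility and effectiveness of the algorithm.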
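
The abstract does not state how the outlier measure factor is defined, so the following is only a minimal sketch of one plausible entropy-based factor, not the authors' formula. It assumes categorical records and scores each record by how much the summed per-attribute Shannon entropy of the data set drops when that record is removed; the names `entropy` and `outlier_factors` are hypothetical.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (in bits) of a value distribution given its counts and total n."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values() if c > 0)


def outlier_factors(records):
    """Assumed outlier factor: total drop in per-attribute entropy when a record is removed.

    records: list of equal-length tuples of categorical values (at least 2 records).
    Returns one score per record; a larger score means the record adds more disorder.
    """
    n = len(records)
    n_attrs = len(records[0])
    # Value counts for each attribute (column) over the whole data set.
    col_counts = [Counter(r[j] for r in records) for j in range(n_attrs)]
    base = [entropy(c, n) for c in col_counts]

    factors = []
    for r in records:
        score = 0.0
        for j in range(n_attrs):
            reduced = col_counts[j].copy()
            reduced[r[j]] -= 1  # distribution of attribute j without this record
            score += base[j] - entropy(reduced, n - 1)
        factors.append(score)
    return factors


if __name__ == "__main__":
    data = [("a", "x")] * 4 + [("b", "y")]
    scores = outlier_factors(data)
    # Rank records from most to least outlying under this assumed factor.
    ranking = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
    print(ranking[0], data[ranking[0]])
```

Under this assumed definition no distance or density threshold has to be chosen by hand, which is in the spirit of the abstract's claim about removing subjective factors; how the paper itself turns the factor into a final outlier decision is not given here.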

Keywords: outlier data  information entropy  outlier measure factor  data mining

An outlier mining algorithm based on information entropy
ZHANG He, CAI Jiang-hui, ZHANG Ji-fu, QIAO Kan. An outlier mining algorithm based on information entropy [J]. CAAI Transactions on Intelligent Systems, 2010, 5(2): 150-155.
Authors:ZHANG He  CAI Jiang-hui  ZHANG Ji-fu  QIAO Kan
Affiliation: ZHANG He1, CAI Jiang-hui1, ZHANG Ji-fu1, QIAO Kan2 (1. School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China; 2. School of Automation Science and Electrical Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100191, China)
Abstract: The task of outlier mining is to discover patterns that are exceptional, interesting, and sparse or isolated even though they are concealed within tremendous volumes of data. Traditional outlier detection methods are easily influenced by human factors. A novel outlier mining algorithm based on information entropy is formulated. It uses an outlier measurement factor based on information entropy. In the algorithm, the outlier measurement factor of each record is calculated using information entropy. Outlier...
Keywords:outlier  information entropy  outlier measure factor  data mining  
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.