Outlier Analysis Method Based on Hierarchical Clustering (基于层次聚类的离群点分析方法)
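Cite this article: ZHANG Jun-xi, YANG Hai-su. Outlier Analysis Method Based on Hierarchical Clustering[J]. Computer Technology and Development (计算机技术与发展), 2014, 0(8): 80-83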
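Authors: ZHANG Jun-xi, YANG Hai-su
Affiliations: [1] Department of Vehicle and Biomedical Engineering, Xi'an Aeronautical University, Xi'an 710077, Shaanxi, China; [2] Xi'an Electronic Engineering Research Institute, Xi'an 710100, Shaanxi, China
Funding: Natural Science Foundation of Shaanxi Province (2012JM8043)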
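Abstract: Detecting outliers and explaining them reasonably is important for applying data mining results. Examining the attributes of an outlier reveals its outlier characteristics, which in turn allows the clustering result to be interpreted more accurately. To address the different outliers, and their characteristics, that appear in clustering results, this paper applies a hierarchical clustering algorithm to outlier analysis. Agglomerative hierarchical clustering is implemented with a cellular automaton distance transform algorithm, which provides a measure of the distance between clusters. An average metric distance over the evolution period is defined, so that outliers and their characteristics can be found at different levels of the clustering hierarchy. The algorithm produces the clustering result and, at the same time, effectively explains the attributes of the outliers; it has relatively low computational complexity, is suited to parallel computation, and extends to high-dimensional spaces. An empirical study on experimental data verifies the effectiveness of the algorithm.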

Keywords: outliers  agglomerative hierarchical clustering  cellular automata  outlier characteristics

Outlier Analysis Method Based on Hierarchical Clustering
Authors and affiliations: ZHANG Jun-xi, YANG Hai-su (1. Department of Vehicle and Biomedical Engineering, Xi'an Aeronautical University, Xi'an 710077, China; 2. Xi'an Electronic Engineering Research Institute, Xi'an 710100, China)
Abstract: Finding outliers and interpreting them reasonably is very important when applying data mining results. Detecting the attributes of an outlier reveals how it departs from its group, so the clustering results can be interpreted more accurately. In view of the various outliers and their characteristics in clustering results, a hierarchical clustering algorithm is applied to outlier analysis: agglomerative clustering is achieved by a distance transform method based on Cellular Automata (CA), which measures the distance between clusters. An average metric distance over the evolution period is defined, which can discover outliers and their characteristics at each level of the clustering hierarchy. The proposed algorithm obtains the clustering results and simultaneously interprets the properties of the outliers effectively; it has relatively low computational complexity and lends itself to parallel computing and to extension to high-dimensional spaces. Simulation results on sample data are given to illustrate the effectiveness of the algorithm.
Keywords:outliers  agglomerative clustering  cellular automata  clusters  outlier characteristics
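
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, under stated assumptions: points sit on a 2-D integer grid, every point grows by one cell per cellular automaton step (4-neighbourhood), and two clusters merge when their growing fronts touch, so the merge step approximates half the Manhattan distance between them. The names `ca_agglomerate` and `flag_outliers`, the `factor` threshold, and the use of the mean first-merge step as a stand-in for the paper's average metric distance are all hypothetical and not taken from the paper.

```python
def ca_agglomerate(points, width, height):
    """Grow each seed by one cell per CA step (4-neighbourhood) and merge
    clusters when their fronts touch. Assumes distinct points on the grid.

    Returns (merges, first_merge): merges is a list of (step, root_a, root_b);
    first_merge[i] is the step at which point i first joined another cluster
    (None if it never did).
    """
    n = len(points)
    parent = list(range(n))      # union-find forest over the seed points
    size = [1] * n               # cluster size, valid at each root
    first_merge = [None] * n
    merges = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    owner = {p: i for i, p in enumerate(points)}  # cell -> seed that claimed it
    frontier = list(points)
    step = 0
    while frontier:
        step += 1
        next_frontier = []
        for x, y in frontier:
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                cell = (x + dx, y + dy)
                if not (0 <= cell[0] < width and 0 <= cell[1] < height):
                    continue
                if cell not in owner:
                    # Free cell: the front advances into it.
                    owner[cell] = owner[(x, y)]
                    next_frontier.append(cell)
                    continue
                # Occupied cell: merge if it belongs to a different cluster.
                a, b = find(owner[(x, y)]), find(owner[cell])
                if a != b:
                    for root in (a, b):
                        if size[root] == 1:      # a lone point merges for the first time
                            first_merge[root] = step
                    parent[b] = a
                    size[a] += size[b]
                    merges.append((step, a, b))
        frontier = next_frontier
    return merges, first_merge


def flag_outliers(first_merge, factor=2.0):
    """Flag points whose first merge happens later than factor * mean step.
    This threshold rule is an illustrative assumption, not the paper's."""
    steps = [s for s in first_merge if s is not None]
    if not steps:
        return []
    mean_step = sum(steps) / len(steps)
    return [i for i, s in enumerate(first_merge)
            if s is not None and s > factor * mean_step]


# Four tightly packed points and one isolated point on a 16 x 16 grid.
points = [(1, 1), (2, 1), (1, 2), (2, 2), (12, 12)]
merges, first_merge = ca_agglomerate(points, width=16, height=16)
print(first_merge)
print(flag_outliers(first_merge))
```

In this toy run the four neighbouring points merge at the first step, while the isolated point at (12, 12) only joins them much later and is the one flagged. Because each cell update depends only on its immediate neighbours, the growth step is the part that would parallelize naturally, which is consistent with the parallel-computing property the abstract mentions.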
This article is indexed in VIP (维普) and other databases.