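Applied Research on a Phased Nonlinear Clustering Method for Large-Scale Distributed Data
Cite this article: QIU Wei. Applied Research on a Phased Nonlinear Clustering Method for Large-Scale Distributed Data[J]. Digital Community & Smart Home, 2013(12): 7767-7769.
Author: QIU Wei
Affiliation: School of Computer Science, Jiaying University, Meizhou, Guangdong 514015
Funding: Supported by the Natural Science Foundation of Guangdong Province (No. S2013010013307)
Abstract: A phased nonlinear clustering method is proposed that effectively handles the clustering of large-scale distributed data while reducing computational complexity. The algorithm has two phases. First, the data are divided into several spherically distributed subclasses, and K-nearest-neighbor graph theory is applied to the original data to compute vertex energies and extract the high-energy vertex samples. Next, the K-nearest-neighbor algorithm is used to partition these high-energy samples, which gives a coarse partition based on the high-energy samples and, at the same time, an estimate of the number of clusters. Finally, the two clustering results are combined to produce the final clustering. The main advantages of the method are that it can handle complex clustering problems, is fairly stable, and lowers the cost of computing similarity measures on large-scale distributed data while preserving clustering accuracy.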

Keywords: stream data  data mining  clustering  nonlinear

Research on Large Scale Distribution Data Method of Nonlinear Clustering
QIU Wei.Research on Large Scale Distribution Data Method of Nonlinear Clustering[J].Digital Community & Smart Home,2013(12):7767-7769.
Authors:QIU Wei
Affiliation:QIU Wei (School of Computer Science, Jiaying University, Meizhou 514015, China)
Abstract:This paper proposes a phased nonlinear clustering method that efficiently handles clustering of large-scale distributed data while reducing computational complexity. The algorithm consists of two phases. First, the data are divided into several spherically distributed sub-categories, and K-nearest-neighbor graph theory is used to calculate vertex energies on the original data and extract the high-energy vertex samples. Then the K-nearest-neighbor algorithm partitions these high-energy samples, which yields a coarse division based on the high-energy samples and an estimate of the number of clusters. Finally, the results of the two clusterings are combined to obtain the final result. The main advantages of the method are that it can deal with complex clustering problems, is relatively stable, and reduces the computational cost of the similarity measure on large-scale distributed data while maintaining clustering accuracy.
Keywords:manifold data  data mining  clustering  nonlinear
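
The abstract does not define the vertex energy or the details of either phase, so the following Python sketch is only one possible reading of the pipeline, not the author's implementation. It rests on several assumptions that are not in the source: the energy of a vertex is taken to be the inverse of the mean distance to its k nearest neighbors, the `keep` parameter (the fraction of samples treated as high-energy) is hypothetical, the coarse partition is computed as connected components of the k-nearest-neighbor graph over the high-energy samples, and every remaining point inherits the label of its nearest high-energy sample. The initial division into spherical subclasses is omitted.

```python
import numpy as np


def pairwise_dists(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def knn_graph(points, k):
    """Return (neighbor indices, neighbor distances) of the k-NN graph."""
    d = pairwise_dists(points, points)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    idx = np.argsort(d, axis=1)[:, :k]
    return idx, np.take_along_axis(d, idx, axis=1)


def vertex_energy(points, k):
    """ASSUMED energy: inverse mean distance to the k nearest neighbors."""
    _, nn_dist = knn_graph(points, k)
    return 1.0 / (nn_dist.mean(axis=1) + 1e-12)


def connected_components(neighbors):
    """Label components of the symmetrized k-NN graph by depth-first search."""
    n = len(neighbors)
    adj = [set() for _ in range(n)]
    for i, row in enumerate(neighbors):
        for j in row:
            adj[i].add(int(j))
            adj[int(j)].add(i)
    labels = np.full(n, -1)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels, current


def phased_clustering(points, k=4, keep=0.36):
    """Hypothetical two-phase clustering; returns (labels, estimated cluster count)."""
    points = np.asarray(points, dtype=float)
    # Phase 1: score every vertex and keep the highest-energy fraction.
    energy = vertex_energy(points, k)
    n_keep = max(1, int(round(keep * len(points))))
    high = np.argsort(-energy)[:n_keep]
    # Phase 2: coarse partition of the high-energy samples only; the number
    # of components serves as the estimate of the number of clusters.
    nbrs, _ = knn_graph(points[high], min(k, n_keep - 1))
    coarse, n_clusters = connected_components(nbrs)
    # Combination step: each point takes the label of its nearest
    # high-energy sample.
    nearest_high = pairwise_dists(points, points[high]).argmin(axis=1)
    return coarse[nearest_high], n_clusters


# Toy data: two 5x5 grids placed far apart.
grid = np.array([[x, y] for x in range(5) for y in range(5)], dtype=float)
data = np.vstack([grid, grid + 100.0])
labels, n_clusters = phased_clustering(data, k=4, keep=0.36)
```

Because the neighbor graph in the second phase is built only over the retained high-energy samples, the quadratic distance computation there scales with `keep` times the data size rather than with the full data set, which is the kind of saving the abstract claims. The first phase in this sketch still computes all pairwise distances, so it illustrates the structure of the method rather than its reported cost.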
This article is indexed in VIP (维普) and other databases.