
A Two-Phase Clustering Algorithm Based on Near Neighbor Connection for Mixed Data Sets (混合属性数据集的基于近邻连接的两阶段聚类算法)
Cite as: CHEN Xin-quan (陈新泉). A Two-Phase Clustering Algorithm Based on Near Neighbor Connection for Mixed Data Set[J]. Computer Engineering & Science (计算机工程与科学), 2012, 34(9): 135-142
Author: CHEN Xin-quan (陈新泉)
Affiliation: School of Computer Science and Engineering, Chongqing Three Gorges University, Chongqing 404000, China
Funding: Supported by the scientific research project program of Chongqing Three Gorges University
Abstract: To meet the data-preprocessing needs of mixed-attribute data sets, this paper first gives several definitions and related properties, then presents a two-phase clustering algorithm based on near neighbor connection. To improve the time efficiency of the algorithm, ideas and techniques for speeding it up are described. Simulation experiments on several artificial data sets and UCI standard data sets show that, for data sets with an apparent cluster structure, the algorithm often achieves better clustering accuracy than the k-means algorithm and the AP algorithm, which indicates that it is effective to a certain degree. Finally, several directions for future research are given, with the aim of promoting the algorithm and uncovering its value in practical applications.

Keywords: mixed attributes; cluster feature; primary cluster; near neighbor graph
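
The abstract names the main ingredients (mixed attributes, a near neighbor graph, primary clusters, two phases) without giving the definitions, so the following Python sketch only illustrates the general idea and is not the paper's algorithm. Everything in it is an assumption made for illustration: the distance `mixed_distance` (mean of absolute differences on pre-scaled numeric attributes and 0/1 mismatches on categorical ones), the threshold `eps`, the size cutoff `min_size`, and the rule used in the second phase.

```python
def mixed_distance(a, b, numeric_idx, categorical_idx):
    """Assumed dissimilarity for mixed records (not taken from the paper).

    Numeric attributes are expected to be pre-scaled to [0, 1];
    categorical attributes contribute 0 on a match and 1 on a mismatch.
    """
    num = sum(abs(a[i] - b[i]) for i in numeric_idx)
    cat = sum(a[i] != b[i] for i in categorical_idx)
    return (num + cat) / (len(numeric_idx) + len(categorical_idx))


def primary_clusters(points, eps, numeric_idx, categorical_idx):
    """Phase 1 (assumed): connect every pair closer than eps and return
    the connected components of that near neighbor graph."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = mixed_distance(points[i], points[j], numeric_idx, categorical_idx)
            if d <= eps:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def merge_small_clusters(points, clusters, min_size, numeric_idx, categorical_idx):
    """Phase 2 (assumed): attach each primary cluster smaller than min_size
    to the large primary cluster that contains its nearest point."""
    cores = [list(c) for c in clusters if len(c) >= min_size]
    small = [c for c in clusters if len(c) < min_size]
    if not cores:
        return [list(c) for c in clusters]
    merged = [list(c) for c in cores]
    for c in small:
        # Single-link distance from the small cluster to each core.
        gaps = [
            min(
                mixed_distance(points[i], points[j], numeric_idx, categorical_idx)
                for i in c
                for j in core
            )
            for core in cores
        ]
        merged[gaps.index(min(gaps))].extend(c)
    return merged


if __name__ == "__main__":
    # Toy records: one numeric attribute (already in [0, 1]) and one categorical.
    data = [(0.0, "a"), (0.05, "a"), (0.1, "a"),
            (0.9, "b"), (0.95, "b"), (1.0, "b"), (0.3, "a")]
    first = primary_clusters(data, 0.06, [0], [1])
    print(first)
    print(merge_small_clusters(data, first, 2, [0], [1]))
```

The pairwise loop in the first phase is quadratic in the number of records. The abstract mentions techniques for improving time efficiency but does not describe them, so none are attempted here.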
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.