
Clustering algorithms for mixed attributes based on rough set
Cite this article: FAN Li-lin, WANG Juan. Clustering algorithms for mixed attributes based on rough set[J]. Journal of Computer Applications, 2010, 30(12): 3377-3379
Authors: FAN Li-lin  WANG Juan
Affiliation: 1. Henan Normal University 2.
Abstract: Conventional clustering algorithms assign each object strictly to one cluster, but boundary objects often cannot be assigned so strictly. The rough set based k-means and leader clustering algorithms use rough set theory to place each data object in the upper or lower approximation of a cluster, which offers a new perspective on handling uncertainty and resolves this boundary uncertainty well. Their drawbacks are that they cannot handle mixed-attribute data and that their clustering results depend markedly on the initial values. To address these shortcomings, this paper gives a distance definition suited to mixed-attribute data, proposes an improved method for selecting the initial values, and presents a rough set based clustering algorithm for mixed-attribute data. Simulation experiments show that, when the number of clusters is not known in advance, the algorithm achieves markedly higher clustering accuracy than the traditional k-means algorithm.
Keywords: clustering   rough set   k-means algorithm   mixed attribute
Received: 2010-06-24
Revised: 2010-07-17
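
The abstract names two ingredients, a distance for mixed-attribute data and an assignment of objects to the lower or upper approximation of a cluster, but does not give either definition. The Python sketch below is only an illustration under stated assumptions, not the authors' method: `mixed_distance` (a hypothetical name) adds a Euclidean term over numeric attributes to a 0/1 mismatch count over categorical ones, and `rough_assign` (also hypothetical) uses a ratio threshold `epsilon`, a common choice in rough k-means variants, to decide whether an object is a boundary object.

```python
import math

def mixed_distance(a, b, numeric_idx, categorical_idx):
    """Assumed distance: Euclidean over numeric attributes plus a
    0/1 mismatch count over categorical attributes."""
    num = sum((a[i] - b[i]) ** 2 for i in numeric_idx)
    cat = sum(1 for i in categorical_idx if a[i] != b[i])
    return math.sqrt(num) + cat

def rough_assign(obj, centers, numeric_idx, categorical_idx, epsilon=1.2):
    """Return (lower, upper): the cluster indices whose lower and upper
    approximations receive obj. The threshold rule is an assumption."""
    dists = [mixed_distance(obj, c, numeric_idx, categorical_idx) for c in centers]
    nearest = min(range(len(centers)), key=dists.__getitem__)
    d_min = dists[nearest]
    # Other clusters that are almost as close as the nearest one.
    close = [j for j, d in enumerate(dists) if j != nearest and d <= epsilon * d_min]
    if close:
        # Boundary object: only in the upper approximations of all close clusters.
        return [], sorted([nearest] + close)
    # Unambiguous object: lower approximation, hence also the upper approximation.
    return [nearest], [nearest]

# Each object is (numeric value, categorical value).
centers = [(0.0, "red"), (10.0, "blue")]
print(rough_assign((1.0, "red"), centers, [0], [1]))    # ([0], [0])
print(rough_assign((5.0, "green"), centers, [0], [1]))  # ([], [0, 1])
```

The second object is equally far from both centers, so it lands only in the two upper approximations, which is the kind of boundary object that a strict partition would force into a single cluster.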
This article is indexed in Wanfang Data (万方数据) and other databases.