
Clustering algorithm based on set dissimilarity for high dimensional data of categorical attributes
Original title:分类属性高维数据基于集合差异度的聚类算法
Cite this article:WU Sen, WEI Gui-ying, BAI Chen, ZHANG Gui-qiong. Clustering algorithm based on set dissimilarity for high dimensional data of categorical attributes[J]. Chinese Journal of Engineering (工程科学学报), 2010, 32(8): 1085-1089. DOI: 10.13374/j.issn1001-053x.2010.08.045
Authors:WU Sen (武森)  WEI Gui-ying (魏桂英)  BAI Chen (白尘)  ZHANG Gui-qiong (张桂琼)
Affiliation:1.School of Economics and Management, University of Science and Technology Beijing, Beijing 100083, China
Abstract:A clustering algorithm based on set dissimilarity is proposed. Using the set dissimilarity and the reduced (compact) set representation that it defines, the algorithm computes the overall dissimilarity of all objects in a set directly, without calculating the distance between each pair of objects. It also compresses high-dimensional categorical data heavily without affecting computational accuracy, and it obtains the clustering result in a single scan of the data. The time complexity of the algorithm is close to linear. An example on real data shows that the algorithm is effective.
Keywords:clustering  high-dimensional space  sets  dissimilarity  data mining
Received:2010-01-18
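
The abstract does not state the paper's definitions, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes a hypothetical set dissimilarity (the fraction of attributes on which the members of a set do not all share one value) and a hypothetical compact summary (`SetSummary`) that stores, for each attribute, either the shared value or a "mixed" marker. With these assumptions the dissimilarity of a whole set is read from the summary without any pairwise distance, and `one_pass_cluster` (also a hypothetical name) assigns each object in a single scan, using an assumed `threshold` parameter.

```python
from dataclasses import dataclass

# Sentinel meaning "members of the set disagree on this attribute".
MIXED = object()


@dataclass
class SetSummary:
    """Hypothetical compact summary of a set of categorical objects."""
    size: int     # number of objects in the set
    values: list  # per attribute: the shared value, or MIXED

    @classmethod
    def from_object(cls, obj):
        # A single object agrees with itself on every attribute.
        return cls(1, list(obj))

    def merged_with(self, obj):
        # Adding one object costs O(m) for m attributes; no stored
        # member of the set is revisited.
        values = [
            v if (v is not MIXED and v == x) else MIXED
            for v, x in zip(self.values, obj)
        ]
        return SetSummary(self.size + 1, values)

    def dissimilarity(self):
        # Assumed measure: share of attributes without a common value.
        return sum(v is MIXED for v in self.values) / len(self.values)


def one_pass_cluster(objects, threshold):
    """Assign each object to a cluster in one scan of the data."""
    summaries, labels = [], []
    for obj in objects:
        best_idx, best = None, None
        for i, summary in enumerate(summaries):
            candidate = summary.merged_with(obj)
            d = candidate.dissimilarity()
            if d <= threshold and (best is None or d < best.dissimilarity()):
                best_idx, best = i, candidate
        if best is None:
            # No existing cluster can absorb the object: open a new one.
            summaries.append(SetSummary.from_object(obj))
            labels.append(len(summaries) - 1)
        else:
            summaries[best_idx] = best
            labels.append(best_idx)
    return labels, summaries


data = [
    ("red", "small", "yes"),
    ("red", "small", "no"),
    ("blue", "large", "no"),
    ("blue", "large", "no"),
]
labels, summaries = one_pass_cluster(data, threshold=0.4)
print(labels)
```

In this sketch each object is compared against the k current cluster summaries over m attributes, so the scan costs roughly O(n·k·m) for n objects, which grows linearly in n when k stays small. Whether this matches the paper's complexity argument is an assumption.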
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.