Clustering algorithm based on set dissimilarity for high dimensional data of categorical attributes

Cite this article:	WU Sen, WEI Gui-ying, BAI Chen, ZHANG Gui-qiong. Clustering algorithm based on set dissimilarity for high dimensional data of categorical attributes[J]. Chinese Journal of Engineering, 2010, 32(8): 1085-1089. DOI: 10.13374/j.issn1001-053x.2010.08.045

Authors:	WU Sen, WEI Gui-ying, BAI Chen, ZHANG Gui-qiong

Affiliation:	School of Economics and Management, University of Science and Technology Beijing, Beijing 100083, China

Received:	2010-01-18

Abstract:	A clustering algorithm based on set dissimilarity is proposed. Using the defined set dissimilarity and a reduced set representation, the algorithm directly computes the overall dissimilarity of all objects in a set, without calculating the distance between each pair of objects. It also compresses high-dimensional categorical data substantially without affecting computational accuracy, and obtains the clustering result in a single scan of the data. The time complexity of the algorithm is close to linear. An example with real data shows that the algorithm is effective.

Keywords:	clustering; high-dimensional space; set dissimilarity; data mining
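
The abstract does not give the formal definitions of set dissimilarity or the reduced set representation, so the following Python sketch only illustrates the general idea under stated assumptions. Here the dissimilarity of a set is assumed to be the fraction of attributes on which its objects do not all share one value, and the reduced representation is assumed to be one entry per attribute (the shared value, or a marker for disagreement) plus the set size. The names `SetSummary`, `MIXED`, `one_pass_cluster`, and `threshold` are hypothetical and are not taken from the paper.

```python
from typing import Hashable, List, Sequence, Tuple

# Hypothetical marker: the objects in a set disagree on this attribute.
MIXED = object()


class SetSummary:
    """Assumed compact summary of a set of categorical objects."""

    def __init__(self, first: Sequence[Hashable]) -> None:
        self.size = 1
        # Per attribute: the value shared by all objects, or MIXED.
        self.values: List[object] = list(first)

    def merged_values(self, obj: Sequence[Hashable]) -> List[object]:
        # Summary the set would have if obj were added to it.
        return [v if v is not MIXED and v == x else MIXED
                for v, x in zip(self.values, obj)]

    def dissimilarity_with(self, obj: Sequence[Hashable]) -> float:
        # Assumed measure: share of attributes that are not unanimous.
        merged = self.merged_values(obj)
        return sum(v is MIXED for v in merged) / len(merged)

    def add(self, obj: Sequence[Hashable]) -> None:
        self.values = self.merged_values(obj)
        self.size += 1


def one_pass_cluster(
    data: Sequence[Sequence[Hashable]], threshold: float
) -> Tuple[List[int], List[SetSummary]]:
    """Assign each object once, using only the cluster summaries."""
    summaries: List[SetSummary] = []
    labels: List[int] = []
    for obj in data:
        best_idx, best_d = None, None
        for idx, summary in enumerate(summaries):
            d = summary.dissimilarity_with(obj)
            if best_d is None or d < best_d:
                best_idx, best_d = idx, d
        if best_idx is not None and best_d <= threshold:
            summaries[best_idx].add(obj)
            labels.append(best_idx)
        else:
            # No existing set accepts the object: start a new one.
            summaries.append(SetSummary(obj))
            labels.append(len(summaries) - 1)
    return labels, summaries


data = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "flat"),
    ("blue", "large", "round"),
    ("red", "small", "round"),
]
labels, summaries = one_pass_cluster(data, threshold=0.4)
print(labels)
```

In this sketch each object is compared only against the per-cluster summaries, never against individual objects, so the cost is proportional to the number of objects times the number of clusters times the number of attributes. That is consistent in spirit with the near-linear behaviour the abstract reports, but it is not a reproduction of the authors' algorithm.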
This article is indexed in CNKI, Wanfang Data, and other databases.