Constraint-based incremental clustering algorithm with mixed attributes

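Cite this article:	SU Xiao-ke (苏晓珂), LAN Yang (兰洋), CHENG Yao-dong (程耀东), WAN Ren-xia (万仁霞). Constraint-based incremental clustering algorithm with mixed attributes[J]. Computer Engineering and Design (计算机工程与设计), 2010, 31(8)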

Authors:	SU Xiao-ke (苏晓珂), LAN Yang (兰洋), CHENG Yao-dong (程耀东), WAN Ren-xia (万仁霞)

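Affiliations:	1. College of Information Science and Technology, Donghua University, Shanghai 201620, China; 2. School of Computer and Information Technology, Xinyang Normal University, Xinyang, Henan 464000, China; 3. Computing Center, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China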

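Funding:	National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China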

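Abstract:	To address the limited memory capacity available when clustering large-scale datasets, a fast clustering algorithm based on a constraint on the number of clusters is proposed. The algorithm scans the original dataset only once, and its radius threshold changes dynamically during clustering. An inter-cluster dissimilarity measure that incorporates the frequency information of categorical attribute values is also defined, so the algorithm can be applied to mixed-attribute datasets. Its time and space complexity are approximately linear in the dataset size and the number of attributes. Experimental results on the KDDCUP99 dataset show that the proposed algorithm requires few input parameters, has good clustering characteristics, and can be applied to large-scale datasets.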
Keywords:	mixed attributes; incremental clustering; dissimilarity measure; large-scale dataset; constraint
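
The abstract names the ingredients of the method (a single pass over the data, a cap on the number of clusters, a radius threshold that changes during clustering, and a dissimilarity that uses categorical value frequencies) but does not give its formulas. The Python sketch below is therefore only an illustration under stated assumptions, not the authors' algorithm: the per-attribute dissimilarity, the geometric radius growth, the greedy re-merge step, and every name (`Cluster`, `dissimilarity`, `merge_within`, `cluster_stream`, `growth`) are hypothetical. Numeric attributes are assumed to be scaled to [0, 1] beforehand.

```python
from collections import Counter


class Cluster:
    """Fixed-size summary: record count, numeric sums, categorical value counts."""

    def __init__(self, num, cat):
        self.n = 1
        self.sums = list(num)
        self.freqs = [Counter([v]) for v in cat]

    def absorb(self, other):
        """Merge another summary (a single record or a whole cluster) into this one."""
        self.n += other.n
        self.sums = [a + b for a, b in zip(self.sums, other.sums)]
        for mine, theirs in zip(self.freqs, other.freqs):
            mine.update(theirs)


def dissimilarity(a, b):
    """Assumed measure, averaged over all attributes.

    Numeric attribute: absolute difference of the cluster means.
    Categorical attribute: 1 - sum_v p_a(v) * p_b(v), with p the value frequency.
    """
    num = sum(abs(sa / a.n - sb / b.n) for sa, sb in zip(a.sums, b.sums))
    cat = sum(
        1.0 - sum(fa[v] * fb[v] for v in fa) / (a.n * b.n)
        for fa, fb in zip(a.freqs, b.freqs)
    )
    total = len(a.sums) + len(a.freqs)
    return (num + cat) / total if total else 0.0


def merge_within(clusters, radius):
    """Greedily merge summaries whose dissimilarity is at most `radius`."""
    merged = []
    for c in clusters:
        nearest = min(merged, key=lambda m: dissimilarity(c, m), default=None)
        if nearest is not None and dissimilarity(c, nearest) <= radius:
            nearest.absorb(c)
        else:
            merged.append(c)
    return merged


def cluster_stream(records, max_clusters, radius=0.1, growth=1.5):
    """One pass over `records`, an iterable of (numeric_values, categorical_values).

    Whenever the cluster count exceeds `max_clusters`, the radius is enlarged
    (an assumed update rule) and the existing summaries are re-merged.
    """
    if max_clusters < 1 or radius <= 0 or growth <= 1:
        raise ValueError("need max_clusters >= 1, radius > 0 and growth > 1")
    clusters = []
    for num, cat in records:
        point = Cluster(num, cat)  # treat each record as a singleton cluster
        nearest = min(clusters, key=lambda c: dissimilarity(point, c), default=None)
        if nearest is not None and dissimilarity(point, nearest) <= radius:
            nearest.absorb(point)
        else:
            clusters.append(point)
        while len(clusters) > max_clusters:
            radius *= growth
            clusters = merge_within(clusters, radius)
    return clusters, radius
```

Because only the cluster summaries are kept in memory, a call such as `cluster_stream(records, max_clusters=50)` can take a generator that reads records from disk. In this sketch the work per record is proportional to the number of clusters times the number of attributes, which is consistent with the near-linear cost the abstract reports, although the authors' actual update rules may differ.
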
Constraint-based incremental clustering algorithm with mixed attributes

SU Xiao-ke,LAN Yang,CHENG Yao-dong,WAN Ren-xia. Constraint-based incremental clustering algorithm with mixed attributes[J]. Computer Engineering and Design, 2010, 31(8)

Authors:	SU Xiao-ke, LAN Yang, CHENG Yao-dong, WAN Ren-xia

Affiliation:	SU Xiao-ke1, LAN Yang2, CHENG Yao-dong3, WAN Ren-xia1 (1. College of Information Science and Technology, Donghua University, Shanghai 201620, China; 2. School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China; 3. Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China)

Abstract:	To address the memory-capacity constraint encountered when clustering large-scale datasets, a fast clustering algorithm based on a constraint on the number of clusters is put forward. The original dataset is read only once, and the radius threshold changes dynamically. An inter-cluster dissimilarity measure that takes into account the frequency information of categorical attribute values is also introduced, so the algorithm can be used on mixed-attribute datasets. The time complexity and space complexity are near...

Keywords:	mixed attributes; incremental clustering; dissimilarity measure; large-scale dataset; constraint
This article is indexed in CNKI, Wanfang Data, and other databases.
