首页 | 本学科首页   官方微博 | 高级检索  
     

TLWCC:一种双层子空间加权协同聚类算法
引用本文:肖龙飞,陈小军.TLWCC:一种双层子空间加权协同聚类算法[J].集成技术,2013,2(1):16-22.
作者姓名:肖龙飞  陈小军
作者单位:深圳市高性能数据挖掘重点实验室;中国科学院深圳先进技术研究院
摘    要:协同聚类是对数据矩阵的行和列两个方向同时进行聚类的一类算法。本文将双层加权的思想引入协同聚类,提出了一种双层子空间加权协同聚类算法(TLWCC)。TLWCC对聚类块(co-cluster)加一层权重,对行和列再加一层权重,并且算法在迭代过程中自动计算块、行和列这三组权重。TLWCC考虑不同的块、行和列与相应块、行和列中心的距离,距离越大,认为其噪声越强,就给予小权重;反之噪声越弱,给予大权重。通过给噪声信息小权重,TLWCC能有效地降低噪声信息带来的干扰,提高聚类效果。本文通过四组实验展示TLWCC算法识别噪声信息的能力、参数选取对算法聚类结果的影响程度,算法的聚类性能和时间性能。

关 键 词:协同聚类  加权  聚类  数据挖掘

TLWCC: A Two-Level Subspace Weighting Co-clustering Algorithm
Authors:Xiao Feilong  Chen Xiao Jun
Affiliation:1,2 1 ( Shenzhen Key Laboratory of High Performance Data Mining, Shenzhen 518055, China ) 2 ( Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China )
Abstract:Co-clustering algorithms cluster a data matrix into row clusters and column clusters simultaneously. In this paper, we propose TLWCC, a two-level subspace weighting co-clustering algorithm, and introduces the idea of a two-level subspace weighting method into the co-clustering process. TLWCC adds the first level of weights on co-clusters, and then adds the second level of weights on rows and columns. The three types of weights (co-cluster, row and column weights) are computed in the clustering progress, according to the distances between co-clusters (or rows, columns) and their centers. The larger the distance is, the stronger noise it implies, so a smaller weight is given and vice verse. Thus, by giving small weights to noise, TLWCC filters out the noise and improves the co-clustering result. We propose an iterative algorithm to optimize the model. We carried out four experiments to learn more about TLWCC. The first experiment investigated the properties of three types of weights. The second experiment studied how the clustering result was influenced by the parameters. The third experiment compared the clustering performance of TLWCC with other three algorithms. The fourth experiment examined the computational efficiency of our proposed algorithm.
Keywords:co-clustering  weighting  clustering  data mining
本文献已被 CNKI 等数据库收录!
点击此处可从《集成技术》浏览原始摘要信息
点击此处可从《集成技术》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号