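An Analysis of Diversity Measures in Clustering Ensembles (聚类集成中的差异性度量研究)
Cite this article:LUO Hui-Lan (罗会兰),KONG Fan-Sheng (孔繁胜),LI Yi-Xiao (李一啸).An Analysis of Diversity Measures in Clustering Ensembles[J].Chinese Journal of Computers (计算机学报),2007,30(8):1315-1324.
Authors:LUO Hui-Lan  KONG Fan-Sheng  LI Yi-Xiao
Affiliation:1. Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027; School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000
2. Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027
Abstract:The diversity of an ensemble is regarded as a key factor affecting ensemble learning. Many diversity measures have been proposed for classifier ensembles, but how to measure the diversity of a clustering ensemble has so far received little study. The authors examine seven diversity measures for clustering ensembles and investigate experimentally how these seven measures relate to the performance of various clustering ensemble algorithms under different average member clustering accuracies, different ensemble sizes, and different data distributions. The experiments show that there is no monotonic relationship between these diversity measures and clustering ensemble performance. However, when the average member accuracy is high, the ensemble size is moderate, and the data have a uniform cluster distribution, the correlation between the measures and ensemble performance is relatively high. Finally, some practical suggestions are given on using diversity measures to guide the generation of clustering ensembles.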

Keywords:ensemble learning  clustering ensemble  diversity  measure
Revised:2007-03-01

An Analysis of Diversity Measures in Clustering Ensembles
LUO Hui-Lan,KONG Fan-Sheng,LI Yi-Xiao.An Analysis of Diversity Measures in Clustering Ensembles[J].Chinese Journal of Computers,2007,30(8):1315-1324.
Authors:LUO Hui-Lan  KONG Fan-Sheng  LI Yi-Xiao
Affiliation:1.Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027;2.School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000
Abstract:The diversity of an ensemble is known to be an important factor in determining its performance. There are a number of ways to quantify diversity in ensembles of classifiers, but little research has addressed diversity in clustering ensembles. This paper compares seven diversity measures for clustering ensembles with regard to their possible use in ensemble design. Five experiments were designed to examine the relationships between the accuracy of the clustering ensembles and the measures of diversity under different ensemble methods, different ensemble sizes, and different data distributions. The experiments show that the relationships between these diversity measures and ensemble performance are not monotonic. However, when ensembles of moderate size are constructed by suitable clustering algorithms for a data set with a uniform cluster distribution, the correlation coefficients between the diversity measures and ensemble performance are relatively high. Finally, the authors offer suggestions on using diversity measures to build clustering ensembles.
Keywords:ensemble learning  clustering ensemble  diversity  measure
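
The abstract does not name the seven diversity measures it compares. As a minimal sketch of what a pairwise diversity measure for a clustering ensemble can look like, the Python below scores an ensemble by the mean disagreement (1 minus the Rand index) over all pairs of member partitions. The choice of the Rand index and the function names `rand_index` and `ensemble_diversity` are assumptions made for illustration and are not claimed to be among the paper's measures.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of object pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two
    objects in the same cluster, or both put them in different clusters.
    """
    if len(a) != len(b):
        raise ValueError("partitions must label the same objects")
    pairs = list(combinations(range(len(a)), 2))
    if not pairs:
        raise ValueError("need at least two objects")
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def ensemble_diversity(partitions):
    """Mean pairwise disagreement (1 - Rand index) over all member pairs.

    Illustrative measure only (an assumption, not taken from the paper).
    """
    member_pairs = list(combinations(partitions, 2))
    if not member_pairs:
        raise ValueError("need at least two partitions")
    total = sum(1.0 - rand_index(p, q) for p, q in member_pairs)
    return total / len(member_pairs)


# Three hypothetical member clusterings of four objects.
ensemble = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same grouping as the first, labels permuted
    [0, 1, 0, 1],
]
print(ensemble_diversity(ensemble))
```

Because the Rand index compares co-membership of object pairs instead of raw labels, the first two members count as identical even though their label values differ. This matters for clustering ensembles, where cluster labels carry no fixed meaning across members.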
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.