
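Research on Clustering Quality Evaluation Indices Based on Improved K-medoids
Cite this article: ZOU Chen-Song, DUAN Gui-Qin. Research on Clustering Quality Evaluation Indices Based on Improved K-medoids[J]. Computer Systems & Applications, 2019, 28(6): 235-242.
Authors: ZOU Chen-Song, DUAN Gui-Qin
Affiliations: Department of Electrical Engineering, Guangdong Songshan Polytechnic, Shaoguan 512126; Department of Computer Science, Guangdong Songshan Polytechnic, Shaoguan 512126
Funding: Shaoguan Science and Technology Plan Project (2017CX/K055); Key Science and Technology Project of Guangdong Songshan Polytechnic (2018KJZD001)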
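Abstract: To better evaluate the clustering quality of unsupervised clustering algorithms, and to address the failure of clustering evaluation results caused by overlapping cluster centers, this paper analyzes commonly used clustering evaluation indices and proposes a new internal evaluation index. The index takes the product of the sum of squared minimum distances between adjacent boundary points of clusters and the number of samples within the clusters as the separation of the whole sample set, which balances inter-cluster separation against intra-cluster compactness. The paper also proposes a new density calculation method: objects for which the ratio of the sample set's average distance to the object's own average distance is large are treated as high-density points, and the maximum product method is used to select data objects that are relatively dispersed and have high density as the initial cluster centers. This makes the initial medoids of the K-medoids algorithm more representative and the algorithm more stable. On this basis, a clustering quality evaluation model is designed around the newly proposed internal index. Experimental results on UCI and KDD CUP 99 data sets show that the new model can effectively cluster and reasonably evaluate samples without prior knowledge, and can give the optimal number of clusters or an optimal range for it.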

Keywords: clustering evaluation index; K-medoids; unsupervised clustering; optimal number of clusters
Received: 2018/12/19
Revised: 2019/1/10

Cluster Quality Evaluation Index Based on K-medoids Algorithm
ZOU Chen-Song and DUAN Gui-Qin. Cluster Quality Evaluation Index Based on K-medoids Algorithm[J]. Computer Systems & Applications, 2019, 28(6): 235-242.
Authors: ZOU Chen-Song and DUAN Gui-Qin
Affiliation: Department of Electrical Engineering, Guangdong Songshan Polytechnic, Shaoguan 512126, China; Department of Computer Science, Guangdong Songshan Polytechnic, Shaoguan 512126, China
Abstract: To better evaluate the clustering quality of unsupervised clustering algorithms, and to solve the problem of clustering evaluation results becoming invalid when cluster centers overlap, commonly used cluster evaluation indices are analyzed and a new internal evaluation index is proposed. The product of the sum of squared minimum distances between adjacent boundary points of clusters and the number of samples in the cluster is taken as the separation degree of the whole sample set, which balances the separation between clusters against the compactness within clusters. A new density calculation method is also proposed: an object is treated as a high-density point when the ratio of the sample set's average distance to that object's average distance is large, and the maximum product method is used to select relatively dispersed, high-density data objects as the initial cluster centers. This improves the representativeness of the initial centers of the K-medoids algorithm and the stability of the algorithm. On this basis, a cluster quality evaluation model is designed with the newly proposed internal evaluation index. Experimental results on UCI and KDD CUP 99 data sets show that the new model can effectively cluster and reasonably evaluate samples without prior knowledge, and can give the optimal number of clusters or an optimal range for it.
Keywords: cluster evaluation index; K-medoids; unsupervised clustering; optimum clustering number
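
The abstract describes the initial-center selection only in words, so the following Python sketch is one possible reading of it, not the authors' implementation. It assumes Euclidean distance, defines the density of a sample as the ratio of the overall mean pairwise distance to that sample's own mean distance, treats samples with a ratio of at least 1 as "high-density" (this threshold is an assumption), and reads the "maximum product method" as: start from the densest candidate, then repeatedly add the candidate whose product of distances to the already chosen centers is largest. The function names `density_ratio` and `select_initial_medoids` are hypothetical.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X (shape n x d)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def density_ratio(D):
    """Assumed density: overall mean pairwise distance divided by each
    sample's mean distance to the other samples (larger = denser)."""
    n = D.shape[0]
    per_sample_mean = D.sum(axis=1) / (n - 1)
    overall_mean = D.sum() / (n * (n - 1))
    return overall_mean / per_sample_mean


def select_initial_medoids(X, k):
    """Pick k initial medoid indices from high-density samples using an
    assumed form of the maximum product rule."""
    D = pairwise_distances(X)
    rho = density_ratio(D)

    # Assumption: a sample is "high-density" when its ratio is >= 1.
    candidates = np.flatnonzero(rho >= 1.0)
    if len(candidates) < k:
        # Fall back to the k densest samples if too few pass the threshold.
        candidates = np.argsort(-rho)[:k]

    # First center: the densest candidate.
    medoids = [int(candidates[np.argmax(rho[candidates])])]

    # Remaining centers: maximize the product of distances to chosen centers.
    # An already chosen center has product 0, so it is not picked again
    # unless every remaining candidate coincides with a chosen center.
    while len(medoids) < k:
        products = np.prod(D[np.ix_(candidates, medoids)], axis=1)
        medoids.append(int(candidates[np.argmax(products)]))
    return medoids
```

The returned indices would then seed a standard K-medoids loop (assign each sample to its nearest medoid, update each medoid to the member minimizing the within-cluster distance sum). The proposed internal evaluation index is not sketched here because the abstract does not give its formula.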