
Improved k-prototypes clustering algorithm based on average difference degree
Cite this article: SHI Hong-yan, XU Ming-ming. Improved k-prototypes clustering algorithm based on average difference degree[J]. Journal of Shenyang University of Technology, 2019, 41(5): 555-559.
Authors: SHI Hong-yan, XU Ming-ming
Affiliation: School of Science, Shenyang University of Technology, Shenyang 110870, China
Funding: National Natural Science Foundation of China (61074005)
Abstract: The k-prototypes clustering algorithm selects its initial cluster centers at random, which makes its clustering results unstable, and most existing clustering algorithms for mixed-attribute data deliver low clustering quality. To address both problems, an improved k-prototypes clustering algorithm based on average difference degree is proposed. The initial cluster centers are selected using the average difference degree, which removes the randomness of the selection. In addition, information entropy is used to determine the attribute weights of the numerical data, the metric formula for categorical attributes is improved, and a metric formula for mixed-attribute data is given. The results show that the improved algorithm achieves higher accuracy and handles mixed-attribute data effectively.
Keywords:k-prototypes algorithm  clustering  initial clustering center  mixed attribute data  average difference degree  information entropy  attribute weight  metric formula  
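
The abstract names three ingredients (entropy-based weights for numerical attributes, a mixed-attribute metric, and initial centers chosen from average dissimilarity) but does not give the formulas. The Python sketch below is therefore only an illustration under stated assumptions, not the authors' method: `entropy_weights` uses the standard entropy weight method, `mixed_dissimilarity` uses the classic k-prototypes measure (weighted squared Euclidean distance plus `gamma` times the number of categorical mismatches) rather than the paper's improved categorical formula, and `pick_initial_centers` applies one plausible deterministic rule (start from the object with the smallest average dissimilarity, then repeatedly add the object farthest from the centers already chosen). All function names and the `gamma` parameter are hypothetical.

```python
import numpy as np


def entropy_weights(X_num):
    """Standard entropy weight method (assumed, not taken from the paper).

    Columns with lower entropy get larger weights. Needs at least 2 rows.
    """
    X = np.asarray(X_num, dtype=float)
    n = X.shape[0]
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # constant column: avoid 0/0
    scaled = (X - X.min(axis=0)) / span        # min-max scale to [0, 1]
    col_sum = scaled.sum(axis=0)
    # Constant columns carry no information: give them a uniform distribution.
    p = np.where(col_sum > 0, scaled / np.where(col_sum > 0, col_sum, 1.0), 1.0 / n)
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)  # 0*log(0) = 0
    entropy = -plogp.sum(axis=0) / np.log(n)
    diversity = 1.0 - entropy
    total = diversity.sum()
    if total == 0:                             # every column uninformative
        return np.full(X.shape[1], 1.0 / X.shape[1])
    return diversity / total


def mixed_dissimilarity(a_num, a_cat, b_num, b_cat, weights, gamma=1.0):
    """Classic k-prototypes measure with per-attribute numeric weights (assumed)."""
    diff = np.asarray(a_num, dtype=float) - np.asarray(b_num, dtype=float)
    numeric = np.sum(weights * diff ** 2)
    mismatches = np.sum(np.asarray(a_cat) != np.asarray(b_cat))
    return float(numeric + gamma * mismatches)


def pick_initial_centers(X_num, X_cat, k, weights, gamma=1.0):
    """Deterministic seeding from average dissimilarity (hypothetical rule)."""
    n = len(X_num)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = mixed_dissimilarity(
                X_num[i], X_cat[i], X_num[j], X_cat[j], weights, gamma)
    avg = D.sum(axis=1) / (n - 1)              # average dissimilarity to all others
    centers = [int(np.argmin(avg))]            # most central object first
    while len(centers) < k:
        nearest = D[:, centers].min(axis=1)    # distance to closest chosen center
        nearest[centers] = -1.0                # never re-pick a chosen object
        centers.append(int(np.argmax(nearest)))
    return centers, avg
```

Because every step is deterministic, repeated runs on the same data return the same initial centers, which is the property the abstract attributes to replacing random initialization. The returned indices would then seed an ordinary k-prototypes iteration.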
This article is indexed in CNKI and other databases.