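A New Data Clustering Method Based on an Improved Gaussian Kernel Metric and KPCA
Cite this article: YU Wen-Li, YU Jian-Jun, FANG Jian-Wen. A New Data Clustering Method Based on an Improved Gaussian Kernel Metric and KPCA [J]. Computer Systems & Applications, 2017, 26(10): 150-155.
Authors: YU Wen-Li, YU Jian-Jun, FANG Jian-Wen
Affiliations: College of Information Engineering, Quzhou College of Technology, Quzhou 324000; College of Information Engineering, Quzhou College of Technology, Quzhou 324000; College of Electrical and Information Engineering, Quzhou University, Quzhou 324000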
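Abstract: Most hyper-ellipsoidal clustering (HEC) algorithms use the Mahalanobis distance as the distance metric. It has been proven that under this condition the cost function of partitional clustering is a constant, so HEC cannot achieve ellipsoidal clustering. This paper shows that an HEC algorithm using an improved Gaussian kernel can be interpreted as searching for ellipsoidal clusters that are compact in both volume and density, and proposes a practical HEC algorithm, K-HEC, which can effectively handle clusters that are ellipsoidal and of different sizes and densities. To cluster data sets with more complex shapes, ellipsoids defined in the kernel feature space are used to extend the capability of K-HEC, yielding the EK-HEC algorithm. Simulation experiments show that the proposed algorithms outperform the K-means algorithm, the fuzzy C-means algorithm, the GMM-EM algorithm, and the Mahalanobis HEC algorithm based on minimum-volume ellipsoids (MVE) in both clustering results and performance, which demonstrates the feasibility and effectiveness of the proposed algorithms.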

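Keywords: data clustering; hyper-ellipsoidal clustering; minimum-volume ellipsoids; kernel principal component analysis; Gaussian kernel
Received: 2017-01-11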

Novel Data Clustering Method Based on a Modified Gaussian Kernel Metric and Kernel PCA
YU Wen-Li, YU Jian-Jun and FANG Jian-Wen. Novel Data Clustering Method Based on a Modified Gaussian Kernel Metric and Kernel PCA [J]. Computer Systems & Applications, 2017, 26(10): 150-155.
Authors: YU Wen-Li, YU Jian-Jun and FANG Jian-Wen
Affiliation: College of Information Engineering, Quzhou College of Technology, Quzhou 324000, China; College of Information Engineering, Quzhou College of Technology, Quzhou 324000, China; and College of Electrical and Information Engineering, Quzhou University, Quzhou 324000, China
Abstract: Most hyper-ellipsoidal clustering (HEC) algorithms use the Mahalanobis distance as the distance metric. It has been proven that, under this condition, the cost function of partitional clustering is a constant, so HEC cannot be realized. We show that HEC with a modified Gaussian kernel metric can be interpreted as the problem of finding compact ellipsoidal clusters (with respect to both the volumes and the densities of the clusters), and we propose a practical HEC algorithm, K-HEC, that efficiently handles clusters that are ellipsoidal in shape and differ in size and density. We then refine the K-HEC algorithm by using ellipsoids defined in the kernel feature space to handle clusters of more complex shape. Simulation experiments show that the proposed methods improve significantly, in both clustering results and performance, on the K-means algorithm, the fuzzy C-means algorithm, the GMM-EM algorithm, and the HEC algorithm based on minimum-volume ellipsoids with the Mahalanobis distance.
Keywords: data clustering; hyper-ellipsoidal clustering; minimum-volume ellipsoids; kernel PCA; Gaussian kernel
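
The abstract's statement that the partitional cost is constant under the Mahalanobis distance can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes each cluster's covariance is the maximum-likelihood estimate computed from that cluster's own points; under that assumption the summed squared Mahalanobis distance of a cluster with n points in d dimensions is n·d, so the total over any partition of N points is N·d. The function name `mahalanobis_cost` and the synthetic data are illustrative assumptions.

```python
import numpy as np


def mahalanobis_cost(X, labels):
    """Sum of squared Mahalanobis distances from each point to its own
    cluster mean, using each cluster's ML covariance (divide by n).

    Hypothetical helper for illustration only.
    """
    total = 0.0
    for k in np.unique(labels):
        cluster = X[labels == k]
        diff = cluster - cluster.mean(axis=0)
        cov = diff.T @ diff / len(cluster)      # ML covariance estimate
        cov_inv = np.linalg.inv(cov)
        # sum_i diff_i^T cov_inv diff_i
        total += np.einsum("ij,jk,ik->", diff, cov_inv, diff)
    return total


rng = np.random.default_rng(0)
N, d = 300, 2
# Two synthetic Gaussian blobs of different spread (assumed test data).
X = np.vstack([
    rng.normal(0.0, 1.0, size=(N // 2, d)),
    rng.normal(5.0, 0.5, size=(N // 2, d)),
])

labels_true = np.repeat([0, 1], N // 2)        # partition matching the blobs
labels_random = rng.integers(0, 2, size=N)     # arbitrary partition

cost_true = mahalanobis_cost(X, labels_true)
cost_random = mahalanobis_cost(X, labels_random)

# Both costs agree with N * d up to floating-point error.
print(cost_true, cost_random, N * d)
```

Because a sensible partition and an arbitrary one give the same cost, minimizing this cost cannot distinguish between them, which is the motivation the abstract gives for replacing the plain Mahalanobis metric.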