KNN classification algorithm based on improved K-modes clustering
Cite this article: WANG Zhi-hua, LIU Shao-ting, LUO Qi. KNN classification algorithm based on improved K-modes clustering[J]. Computer Engineering and Design, 2019, 40(8): 2228-2234.
Authors: WANG Zhi-hua (王志华), LIU Shao-ting (刘绍廷), LUO Qi (罗齐)
Affiliation: School of Software and Applied Science and Technology, Zhengzhou University, Zhengzhou 450002, Henan, China
Funding: National Social Science Fund of China; Henan Province Science and Technology Research Program
Abstract: The K-modes algorithm has a high error rate when it initializes its k clusters, and the KNN (K-nearest neighbor) algorithm classifies inaccurately on large sample data. To address both problems, the paper analyzes the full course of the traditional K-modes algorithm, from the initialization of the k clusters until the cluster centers stop changing, together with the low execution efficiency of KNN on large sample data, and proposes an improved K-modes-KNN algorithm. A string kernel function is used to initialize the k clusters and then to iteratively compute the distance from each sample to the cluster centers, so that the centers are updated dynamically. After the improved K-modes algorithm partitions the data set into clusters, a KNN classification model is built within each sub-cluster. Experiments on real data from a research institute show that the proposed algorithm outperforms comparable classification algorithms to some extent.

Keywords: K-modes algorithm; KNN algorithm; classification; cluster center; K-modes-KNN algorithm; string kernel function
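
The cluster-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain K-modes with the simple matching (Hamming) dissimilarity and seeds the modes with the first k distinct rows, whereas the paper initializes and updates the clusters with a string kernel function whose definition is not given in this record. All function names and the toy data are assumptions made for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_mode(modes, row):
    """Index of the mode closest to row (ties go to the lowest index)."""
    return min(range(len(modes)), key=lambda j: hamming(row, modes[j]))


def column_modes(rows):
    """Most frequent value of each attribute across the given rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(rows, k, max_iter=20):
    """Plain K-modes. Seeding with the first k distinct rows is an
    assumption of this sketch, not the paper's string-kernel initialization."""
    modes = list(dict.fromkeys(rows))[:k]
    for _ in range(max_iter):
        groups = [[] for _ in modes]
        for r in rows:
            groups[nearest_mode(modes, r)].append(r)
        # An empty cluster keeps its previous mode.
        new_modes = [column_modes(g) if g else old
                     for g, old in zip(groups, modes)]
        if new_modes == modes:  # cluster centers no longer change
            break
        modes = new_modes
    return modes


def fit(rows, labels, k_clusters):
    """Cluster the training set, then store labeled members per cluster."""
    modes = k_modes(rows, k_clusters)
    clusters = [[] for _ in modes]
    for r, y in zip(rows, labels):
        clusters[nearest_mode(modes, r)].append((r, y))
    return modes, clusters


def predict(model, row, k_neighbors=3):
    """Route the query to its nearest cluster, then run KNN inside it."""
    modes, clusters = model
    members = clusters[nearest_mode(modes, row)]
    if not members:  # fall back to the whole training set
        members = [m for c in clusters for m in c]
    neighbors = sorted(members, key=lambda m: hamming(row, m[0]))[:k_neighbors]
    return Counter(y for _, y in neighbors).most_common(1)[0][0]


# Toy categorical data (hypothetical): (color, shape, size) -> label
rows = [
    ("red", "round", "small"),
    ("red", "round", "large"),
    ("red", "oval", "small"),
    ("green", "long", "large"),
    ("green", "long", "small"),
    ("yellow", "long", "large"),
]
labels = ["apple", "apple", "apple", "cucumber", "cucumber", "banana"]

model = fit(rows, labels, k_clusters=2)
print(predict(model, ("green", "long", "large")))
```

Because each query is compared only against the members of one cluster, the neighbor search touches a fraction of the training set, which is the efficiency argument the abstract makes for large sample data. How well this works depends on cluster quality, which is why the paper focuses on improving the initialization.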
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).