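Research on an Improved KNN Classification Algorithm Based on the MapReduce Programming Model
Cite this article: QIU Ningjia, GUO Chang, YANG Huamin, WANG Peng, WEN Nuan. Research on an Improved KNN Classification Algorithm Based on the MapReduce Programming Model[J]. Journal of Changchun University of Science and Technology, 2017, 40(1).
Authors: QIU Ningjia  GUO Chang  YANG Huamin  WANG Peng  WEN Nuan
Affiliation: School of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022
Funding: Key Science and Technology Research Project of the Jilin Province Science and Technology Development Plan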
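Abstract: An attribute reduction algorithm is applied to the data samples to be classified in two passes: an initial attribute reduction of the decision table, followed by a second reduction based on core attribute values. Attribute reduction removes redundant data from the dataset and thereby improves the classification accuracy of the KNN algorithm. On this basis, the MapReduce parallel programming model is used to run parallel classification experiments on a Hadoop cluster. The experimental results show that the improved algorithm runs far more efficiently in the cluster environment and processes the experimental data efficiently. The speedup ratio of the experiments also improves markedly.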

Keywords: KNN  attribute reduction  MapReduce programming model  Hadoop

The Research of Modified KNN Classification Algorithm Based on MapReduce Model
QIU Ningjia, GUO Chang, YANG Huamin, WANG Peng, WEN Nuan. The Research of Modified KNN Classification Algorithm Based on MapReduce Model[J]. Journal of Changchun University of Science and Technology, 2017, 40(1).
Authors: QIU Ningjia  GUO Chang  YANG Huamin  WANG Peng  WEN Nuan
Abstract: An attribute reduction algorithm is proposed. The data samples to be classified undergo two reduction passes: attribute reduction of the initial decision table and a second reduction based on core attribute values. Attribute reduction deletes redundant data from the dataset and thereby improves the classification accuracy of the KNN algorithm. On this basis, the MapReduce parallel programming model is applied, and parallel classification experiments are run in a Hadoop cluster environment. The experimental results show that the improved algorithm runs much more efficiently in the cluster environment and handles the experimental data effectively. The speedup of the experimental runs also improves significantly.
Keywords: KNN  attribute reduction  MapReduce programming model  Hadoop
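
The abstract does not spell out how the KNN computation is split across map and reduce tasks. The sketch below shows one common decomposition, offered as an assumption and not as the authors' implementation: each map task finds the k nearest neighbours of a query inside its own partition of the training data, and a single reduce task merges those local candidates and takes a majority vote. The function names (knn_map, knn_reduce, classify) and the toy data are hypothetical, plain Python stands in for Hadoop, and the attribute reduction step is assumed to have already removed redundant attributes from the feature tuples.

```python
import heapq
import math
from collections import Counter


def knn_map(partition, query, k):
    """Map step (assumed design): k nearest (distance, label) pairs in one partition."""
    scored = ((math.dist(features, query), label) for features, label in partition)
    return heapq.nsmallest(k, scored)


def knn_reduce(local_candidates, k):
    """Reduce step (assumed design): merge local candidates, then majority vote."""
    merged = [pair for candidates in local_candidates for pair in candidates]
    nearest = heapq.nsmallest(k, merged)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def classify(partitions, query, k=3):
    """Run the map step over every partition, then reduce once."""
    return knn_reduce([knn_map(p, query, k) for p in partitions], k)


# Hypothetical toy training data, split into two partitions as a cluster would hold it.
partitions = [
    [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((5.0, 5.0), "b")],
    [((0.9, 1.1), "a"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")],
]

print(classify(partitions, (1.0, 0.9), k=3))
print(classify(partitions, (5.0, 5.0), k=3))
```

Because every partition returns its own k best candidates, the global k nearest neighbours are always among the merged set, so the result matches a single-machine KNN while the distance computations run in parallel.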
This article is indexed in CNKI, Wanfang Data, and other databases.