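Distributed SVM parameter optimization on the Hadoop platform
Cite this article: WU Yun-wei (吴云蔚), NING Qian (宁芊). Distributed SVM parameter optimization on the Hadoop platform[J]. Computer Engineering & Science (计算机工程与科学), 2017, 39(6): 1042-1047
Authors: WU Yun-wei (吴云蔚)  NING Qian (宁芊)
Affiliation: 1. College of Electronics and Information Engineering, Sichuan University
Funding: National 973 Program of China (2013CB328903-2)
Abstract: The choice of parameters directly affects an algorithm's classification and prediction accuracy. Among parameter selection methods, global grid search is computationally reliable, simple, and clearly effective, which makes it suitable for engineering computations with high reliability requirements, such as parameter optimization of fault pattern recognition algorithms in the fault diagnosis of complex systems. However, the long running time of global grid search remains a constraint on its use, especially in systems with strict real-time requirements. Taking global parameter optimization of the support vector machine as an example, and targeting the long search time of grid search, this work performs distributed parameter optimization on the Hadoop platform: HDFS automatically partitions the parameters across the compute nodes, and a distributed parameter optimization model built on the MapReduce computing framework carries out model training, prediction, and parameter optimization. Experimental results show that search efficiency improves without degrading algorithm performance.

Keywords: Hadoop  MapReduce  support vector machine  grid search  parameter optimization  distributed computing
Received: 2016-01-30
Revised: 2017-06-25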

Distributed SVM parameter optimization based on Hadoop
WU Yun-wei, NING Qian. Distributed SVM parameter optimization based on Hadoop[J]. Computer Engineering & Science, 2017, 39(6): 1042-1047
Authors: WU Yun-wei  NING Qian
Affiliation: College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
Abstract: Parameter choice directly affects the classification and prediction accuracy of an algorithm. Among parameter selection methods, global grid search is reliable, simple to compute, and clearly effective, which makes it well suited to engineering computations with high reliability requirements, for instance parameter optimization of the fault pattern recognition algorithm in the fault diagnosis of complex systems. However, global grid search is time-consuming, which still limits its use, especially in systems with strict real-time requirements. Taking global parameter optimization of the support vector machine (SVM) as a case study, we use the Hadoop platform for distributed parameter optimization to overcome the long search time of grid search. HDFS automatically distributes the parameters across the compute nodes, and a distributed parameter optimization model built on the MapReduce computing framework carries out model training, prediction, and parameter optimization. Experimental results show that optimization efficiency improves without degrading algorithm performance.
Keywords:Hadoop  MapReduce  support vector machine  grid search  parameter optimization  distributed computing  
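
The abstract describes splitting the parameter grid across compute nodes, evaluating each parameter combination in a map step, and selecting the best one afterwards. The following is a minimal single-process sketch of that idea in Python, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the (C, gamma) grid built from powers of two, the round-robin split, and in particular `toy_accuracy`, a hypothetical stand-in for training an SVM and measuring its accuracy. On a real cluster the map tasks would run in parallel under Hadoop rather than in a Python loop.

```python
import itertools
import math


def make_grid(c_exponents, gamma_exponents):
    """Build the full (C, gamma) grid as powers of two (assumed parameterization)."""
    return [(2.0 ** c, 2.0 ** g)
            for c, g in itertools.product(c_exponents, gamma_exponents)]


def split_grid(grid, n_splits):
    """Partition the grid round-robin into n_splits chunks, one per map task."""
    return [grid[i::n_splits] for i in range(n_splits)]


def toy_accuracy(c, gamma):
    """Hypothetical stand-in for 'train an SVM and return its accuracy'.

    It peaks at C = 2**3, gamma = 2**-5 so the search has a known answer.
    """
    return 1.0 / (1.0 + (math.log2(c) - 3) ** 2 + (math.log2(gamma) + 5) ** 2)


def map_task(chunk, evaluate):
    """Map step: score every pair in one chunk, emitting (score, (C, gamma))."""
    return [(evaluate(c, g), (c, g)) for c, g in chunk]


def reduce_task(mapped_outputs):
    """Reduce step: keep the record with the highest score across all chunks."""
    return max(itertools.chain.from_iterable(mapped_outputs))


def distributed_grid_search(grid, n_splits, evaluate):
    """Run map over every chunk (sequentially here), then reduce to the best pair."""
    chunks = split_grid(grid, n_splits)
    mapped = [map_task(chunk, evaluate) for chunk in chunks]
    return reduce_task(mapped)


if __name__ == "__main__":
    grid = make_grid(range(-5, 16, 2), range(-15, 4, 2))
    score, (best_c, best_gamma) = distributed_grid_search(grid, 4, toy_accuracy)
    print(f"best C={best_c}, gamma={best_gamma}, score={score:.3f}")
```

Because each parameter pair is scored independently, the split changes only where the work runs, not which pair wins: the reduce step returns the same result as an exhaustive serial search over the whole grid, which is consistent with the abstract's claim that efficiency improves without reducing algorithm performance.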