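Research on Multi-model Fusion Algorithm for Small Scale Data Sets
Cite this article: LI Chun-sheng, CAO Qi, YU Shu. Research on Multi-model Fusion Algorithm for Small Scale Data Sets[J]. Computer Technology and Development, 2020(2): 63-66.
Authors: LI Chun-sheng  CAO Qi  YU Shu
Affiliation: School of Computer and Information Technology, Northeast Petroleum University
Funding: General Program of the National Natural Science Foundation of China (51774090); General Program of the Natural Science Foundation of Heilongjiang Province (F2015020); Heilongjiang Province Education and Research Special Guiding Innovation Fund Project (2017YDL-12); Major Project of Heilongjiang Province Education Planning (GJ20170006)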
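Abstract (translated from the Chinese): At present, predictions on small-scale data sets rely mainly on traditional machine learning algorithms, but a single traditional model does not reach the expected accuracy and cannot balance multiple evaluation metrics. Taking small-scale data sets as the research object, this paper therefore fuses three types of model (decision tree, logistic regression, and support vector machine), proposes a multi-model fusion algorithm, and analyzes its performance on small-scale data sets. First, the principles of the decision tree, logistic regression, and support vector machine algorithms are outlined. Second, the three models are used as base learners and trained separately; the output of each model serves as input to the next-stage model, and maximum likelihood estimation is used to optimize the parameters iteratively, which completes the fusion process. Finally, the data sets are analyzed and processed, and the metrics are compared experimentally against those of the single models. The experimental results show that the multi-model fusion algorithm markedly improves precision, recall, and accuracy.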

Keywords (translated from the Chinese): data mining  machine learning  logistic regression  decision tree  model fusion

Research on Multi-model Fusion Algorithm for Small Scale Data Sets
LI Chun-sheng, CAO Qi, YU Shu. Research on Multi-model Fusion Algorithm for Small Scale Data Sets[J]. Computer Technology and Development, 2020(2): 63-66.
Authors: LI Chun-sheng  CAO Qi  YU Shu
Affiliation: School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, China
Abstract: At present, predictions on small-scale data sets rely mainly on traditional machine learning algorithms, but a single traditional model often fails to reach the expected accuracy and cannot balance multiple evaluation metrics. Taking small-scale data sets as the research object, we therefore propose a multi-model fusion algorithm that integrates decision tree, logistic regression, and support vector machine models, and we analyze its performance on small-scale data sets. First, the principles of the decision tree, logistic regression, and support vector machine algorithms are briefly described. Second, the three models are used as base learners and trained individually; their outputs serve as inputs to the next-stage model, and maximum likelihood estimation is used to optimize the parameters iteratively, which completes the multi-model fusion process. Finally, the data sets are analyzed and processed, and the fused model is compared with the single models through experiments. The results show that the algorithm markedly improves precision, recall, and accuracy.
Keywords: data mining  machine learning  logistic regression  decision tree  model fusion
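
The second stage described in the abstract (base-learner outputs fed to a next-stage model whose parameters are optimized iteratively by maximum likelihood) can be illustrated with a minimal sketch. The abstract does not say which model is used at that stage, so this sketch assumes a logistic-regression fusion layer fitted by gradient ascent on the log-likelihood. The function names `fit_fusion_layer` and `predict`, the learning rate, the epoch count, and the toy scores are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_fusion_layer(base_outputs, labels, lr=0.5, epochs=2000):
    """Fit fusion weights by gradient ascent on the mean log-likelihood.

    base_outputs: one row per sample, one column per base learner
                  (each entry is that learner's score in [0, 1]).
    labels:       0/1 class labels.
    Assumption: the fusion layer is a logistic regression; the paper's
    actual second-stage model may differ.
    """
    n, k = len(base_outputs), len(base_outputs[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(base_outputs, labels):
            # Gradient of the Bernoulli log-likelihood: (y - p) * x.
            err = y - sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for j in range(k):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi + lr * g / n for wi, g in zip(w, grad_w)]
        b += lr * grad_b / n
    return w, b


def predict(w, b, x):
    # Fused class label for one row of base-learner scores.
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


if __name__ == "__main__":
    # Hypothetical scores from three base learners (columns: decision tree,
    # logistic regression, SVM). Illustrative values only, not the paper's data.
    scores = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.3], [0.8, 0.9, 0.6], [0.2, 0.1, 0.4]]
    labels = [1, 0, 1, 0]
    w, b = fit_fusion_layer(scores, labels)
    print("weights:", w, "bias:", b)
    print("fused prediction:", predict(w, b, [0.7, 0.4, 0.8]))
```

In practice the scores passed to the fusion layer would normally come from held-out or cross-validated predictions of the base learners, so that the second stage is not fitted on outputs the base learners produced for their own training samples; whether the paper does this is not stated in the abstract.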
This article is indexed in VIP (维普) and other databases.