Software Defect Prediction Based on Hybrid Sampling and Random_Stacking
Cite this article: YAN Ling-ling, JIANG Feng, DU Jun-wei, YANG Ai-guang. Software Defect Prediction Based on Hybrid Sampling and Random_Stacking[J]. Computer and Modernization, 2021, 0(8): 70-76.
Authors: YAN Ling-ling  JIANG Feng  DU Jun-wei  YANG Ai-guang
Affiliation: College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong 266061, China
Funding: National Natural Science Foundation of China (61973180, 61671261); Natural Science Foundation of Shandong Province (ZR2018MF007)
Received: 2021-08-19

Abstract: Existing software defect prediction methods face problems such as class imbalance and high-dimensional data, and solving these problems has become an active research topic in the field. To address the class imbalance and low prediction accuracy encountered in software defect prediction, this paper proposes DP_HSRS, a software defect prediction algorithm based on hybrid sampling and Random_Stacking. DP_HSRS first applies a hybrid sampling algorithm to balance the imbalanced data, then runs the Random_Stacking algorithm on the balanced data set to predict software defects. Random_Stacking is an improvement on the traditional Stacking algorithm: it combines several classic classification algorithms with the Bagging mechanism to build multiple Stacking classifiers, obtains an ensemble classifier by voting among those Stacking classifiers, and finally uses that ensemble classifier to predict software defects. Experimental results on the NASA MDP data sets show that DP_HSRS achieves better defect prediction performance than existing algorithms.

Keywords: software defect prediction  data imbalance  hybrid sampling  Random_Stacking  DP_HSRS
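
The pipeline described in the abstract (balance the data, build several Stacking classifiers on bootstrap samples, then vote) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' DP_HSRS implementation. The abstract does not specify the hybrid sampling algorithm, the base classifiers, or the number of Stacking classifiers, so everything concrete here is an assumption: the function names `hybrid_sample`, `fit_random_stacking`, and `predict_by_vote` are hypothetical, random over/undersampling stands in for the paper's sampling step, and the synthetic data stands in for NASA MDP.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample


def hybrid_sample(X, y, random_state=0):
    """Assumed stand-in for the hybrid sampling step: randomly oversample
    the minority class and undersample the majority class so that every
    class ends up with the mean class size."""
    classes, counts = np.unique(y, return_counts=True)
    target = int(counts.mean())
    parts_X, parts_y = [], []
    for cls, count in zip(classes, counts):
        # Draw with replacement only when the class has to grow.
        X_res = resample(X[y == cls], replace=count < target,
                         n_samples=target, random_state=random_state)
        parts_X.append(X_res)
        parts_y.append(np.full(target, cls))
    return np.vstack(parts_X), np.concatenate(parts_y)


def fit_random_stacking(X, y, n_stacks=5, random_state=0):
    """Fit one Stacking classifier per bootstrap sample (Bagging-style).
    The base learners and meta-learner are illustrative choices."""
    rng = np.random.RandomState(random_state)
    stacks = []
    for _ in range(n_stacks):
        idx = rng.randint(0, len(X), size=len(X))  # bootstrap row indices
        stack = StackingClassifier(
            estimators=[
                ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
                ("nb", GaussianNB()),
            ],
            final_estimator=LogisticRegression(max_iter=1000),
            cv=3,
        )
        stacks.append(stack.fit(X[idx], y[idx]))
    return stacks


def predict_by_vote(stacks, X):
    """Majority vote over the Stacking classifiers (binary labels 0/1)."""
    votes = np.stack([s.predict(X) for s in stacks])  # (n_stacks, n_samples)
    return (votes.mean(axis=0) >= 0.5).astype(int)


# Synthetic imbalanced data: roughly 90% non-defective, 10% defective.
X, y = make_classification(n_samples=300, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_bal, y_bal = hybrid_sample(X, y)
stacks = fit_random_stacking(X_bal, y_bal)
pred = predict_by_vote(stacks, X)
```

In a real evaluation the resampling would be applied to the training split only, so that the held-out modules keep their original class distribution. An odd `n_stacks` avoids tied votes.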