
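Software Defect Prediction Based on Semi-supervised Ensemble Learning*
Cite this article: 王铁建 (WANG Tiejian), 吴飞 (WU Fei), 荆晓远 (JING Xiaoyuan). Software Defect Prediction Based on Semi-supervised Ensemble Learning*[J]. 模式识别与人工智能 (Pattern Recognition and Artificial Intelligence), 2017, 30(7): 646-652. DOI: 10.16451/j.cnki.issn1003-6059.201707007
Authors: 王铁建 (WANG Tiejian)  吴飞 (WU Fei)  荆晓远 (JING Xiaoyuan)
Affiliations: 1. State Key Laboratory of Software Engineering, School of Computer, Wuhan University, Wuhan 430072
2. School of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023
Funding: Supported by the National Natural Science Foundation of China (No.61272273)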
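Abstract: In software defect prediction, a shortage of labeled samples and class imbalance both degrade prediction results. To address these problems, this paper proposes a software defect prediction method based on semi-supervised ensemble learning. The method learns from the large pool of available unlabeled samples to obtain a better classifier, and it combines a series of weak classifiers to reduce the bias that majority-class data introduce into prediction. To account for the risk cost of prediction, the paper also adopts a strategy for updating the weight vector of the training sample set, which lowers the risk of predicting a defective module as non-defective. Comparative experiments on the NASA MDP datasets show that the proposed method achieves good prediction performance.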

Keywords: Software Defect Prediction   Class Imbalance   Semi-supervised Learning   Ensemble Learning
Received: 2016-10-20

Semi-supervised Ensemble Learning Based Software Defect Prediction
WANG Tiejian,WU Fei,JING Xiaoyuan. Semi-supervised Ensemble Learning Based Software Defect Prediction[J]. Pattern Recognition and Artificial Intelligence, 2017, 30(7): 646-652. DOI: 10.16451/j.cnki.issn1003-6059.201707007
Authors:WANG Tiejian  WU Fei  JING Xiaoyuan
Affiliation:1.State Key Laboratory of Software Engineering, School of Computer, Wuhan University, Wuhan 430072
2.School of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023
Abstract:Software defect prediction is often degraded by the limited number of labeled modules and by the class imbalance of software defect data. To address these problems, a software defect prediction approach based on semi-supervised ensemble learning is proposed. The approach learns from a large number of unlabeled modules to build a better classifier, and it combines a series of weak classifiers to reduce the bias introduced by the majority class. To account for the risk cost of prediction, a strategy for updating the sample weight vector is employed, which reduces the risk of misclassifying defective modules as non-defective ones. Comparative experiments on the NASA MDP datasets show that the proposed approach achieves good prediction performance.
Keywords:Software Defect Prediction   Class-Imbalance   Semi-supervised Learning   Ensemble Learning  
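
The abstract names three ingredients: learning from unlabeled modules, an ensemble of weak classifiers, and a sample weight update that penalizes missed defects. The abstract does not give the update rule or the semi-supervised procedure, so the following is only a minimal sketch under stated assumptions, not the paper's algorithm. It uses an AdaBoost-style update with an extra multiplier for false negatives, decision stumps as the weak classifiers, and plain self-training for the unlabeled data. The names `fit_ensemble`, `self_train`, `fn_cost`, and `margin` are hypothetical.

```python
import math

def stump_predict(stump, x):
    """A stump votes `sign` when the chosen feature exceeds the threshold."""
    feature, threshold, sign = stump
    return sign if x[feature] > threshold else -sign

def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for sign in (1, -1):
                stump = (f, t, sign)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def fit_ensemble(X, y, rounds=10, fn_cost=2.0):
    """Boost stumps; labels are +1 (defective) and -1 (non-defective).

    Assumption: a missed defect (false negative) has its weight multiplied
    by an extra factor `fn_cost`, so later rounds focus on defective modules.
    """
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        for i in range(n):
            if stump_predict(stump, X[i]) != y[i]:
                cost = fn_cost if y[i] == 1 else 1.0
                w[i] *= cost * math.exp(alpha)
            else:
                w[i] *= math.exp(-alpha)
        total = sum(w)
        w = [wi / total for wi in w]  # renormalize to a distribution
    return ensemble

def score(ensemble, x):
    """Weighted vote of all stumps; the sign is the predicted class."""
    return sum(a * stump_predict(s, x) for a, s in ensemble)

def predict(ensemble, x):
    return 1 if score(ensemble, x) > 0 else -1

def self_train(X_lab, y_lab, X_unlab, margin=0.5, **kw):
    """Assumption: one round of self-training stands in for the
    semi-supervised step. Confident unlabeled points get pseudo-labels
    and the ensemble is refit on the enlarged training set."""
    ensemble = fit_ensemble(X_lab, y_lab, **kw)
    total_alpha = sum(a for a, _ in ensemble) or 1.0
    X_aug, y_aug = list(X_lab), list(y_lab)
    for x in X_unlab:
        s = score(ensemble, x) / total_alpha  # normalized to [-1, 1]
        if abs(s) >= margin:
            X_aug.append(x)
            y_aug.append(1 if s > 0 else -1)
    return fit_ensemble(X_aug, y_aug, **kw)
```

A call such as `self_train(X_lab, y_lab, X_unlab, fn_cost=3.0)` returns a list of `(alpha, stump)` pairs that `predict` can apply to a new module's feature vector. Raising `fn_cost` trades more false alarms for fewer missed defects, which is the direction the abstract's risk-cost argument points in.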