
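Cross-Project Defect Prediction Method Based on Feature Transfer and Instance Transfer
Cite this article: NI Chao, CHEN Xiang, LIU Wang-Shu, GU Qing, HUANG Qi-Guo, LI Na. Cross-project defect prediction method based on feature transfer and instance transfer [J]. Journal of Software (软件学报), 2019, 30(5): 1308-1329.
Authors: NI Chao, CHEN Xiang, LIU Wang-Shu, GU Qing, HUANG Qi-Guo, LI Na
Affiliations:
NI Chao: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing, Jiangsu 210023
CHEN Xiang: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing, Jiangsu 210023; School of Computer Science and Technology, Nantong University, Nantong, Jiangsu 226019
LIU Wang-Shu: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing, Jiangsu 210023; School of Computer Science and Technology, Nanjing Tech University, Nanjing, Jiangsu 211816
GU Qing: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing, Jiangsu 210023
HUANG Qi-Guo: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing, Jiangsu 210023
LI Na: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing, Jiangsu 210023
Funding: National Natural Science Foundation of China (61373012, 61202006, 91218302, 61321491); Open Project of the State Key Laboratory for Novel Software Technology, Nanjing University (KFKT2016B18, KFKT2018B17); Natural Science Foundation of Jiangsu Province (BK20180695); National Program for State-Sponsored Graduate Students for Building High-Level Universities (201806190172)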
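Abstract: In practical software development, the project that needs defect prediction may be newly started, or its historical training data may be scarce. One solution is to build a model from training data already collected for other projects (the source projects) and use it to predict defects in the current project (the target project). However, the datasets of different projects can differ substantially in distribution. To address this problem, a two-phase cross-project defect prediction method, FeCTrA, is proposed from the perspectives of feature transfer and instance transfer. Specifically, in the feature transfer phase, the method uses cluster analysis to select features whose distributions are highly similar between the source and target projects; in the instance transfer phase, the method builds on TrAdaBoost and uses a small number of labeled instances from the target project to select source-project instances whose distribution is close to that of these labeled instances. To verify the effectiveness of FeCTrA, the Relink and AEEEM datasets are chosen as evaluation subjects and F1 as the performance measure. First, FeCTrA outperforms single-phase methods that consider only the feature transfer phase or only the instance transfer phase. Second, compared with the classic cross-project defect prediction methods TCA+, Peters filter, Burak filter, and DCPDP, FeCTrA improves prediction performance by 23%, 7.2%, 9.8%, and 38.2%, respectively, on the Relink dataset and by 96.5%, 108.5%, 103.6%, and 107.9%, respectively, on the AEEEM dataset. Finally, the factors within FeCTrA that influence prediction performance are analyzed, providing guidance for using FeCTrA effectively.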

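Keywords: software quality assurance; software defect prediction; cross-project defect prediction; transfer learning; feature transfer; instance transfer
Received: August 28, 2018
Revised: October 31, 2018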
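
The abstract reports results as F1 values and percentage improvements over baselines. The sketch below shows how F1 is computed from confusion counts and how a percentage improvement could be derived. It assumes the reported percentages are relative gains over the baseline F1, which the abstract does not state explicitly; gains above 100% are only possible under that reading, because F1 itself never exceeds 1. The function names are illustrative and do not come from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall.

    tp, fp, fn: counts of true positives, false positives, false negatives,
    where "positive" means a module predicted as defective.
    """
    if tp == 0:
        # No defective module was found: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_improvement(new_f1, baseline_f1):
    """Percentage gain of new_f1 over baseline_f1 (assumed definition)."""
    return (new_f1 - baseline_f1) / baseline_f1 * 100.0


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(f1_score(tp=8, fp=2, fn=2))
    print(relative_improvement(0.6, 0.5))
```

Under this reading, a baseline F1 of 0.5 and a new F1 of 0.6 correspond to a 20% improvement.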

Cross-project Defect Prediction Method Based on Feature Transfer and Instance Transfer
NI Chao, CHEN Xiang, LIU Wang-Shu, GU Qing, HUANG Qi-Guo, LI Na. Cross-project Defect Prediction Method Based on Feature Transfer and Instance Transfer [J]. Journal of Software, 2019, 30(5): 1308-1329.
Authors: NI Chao, CHEN Xiang, LIU Wang-Shu, GU Qing, HUANG Qi-Guo, LI Na
Affiliations:
NI Chao: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, China
CHEN Xiang: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, China; School of Computer Science and Technology, Nantong University, Nantong 226019, China
LIU Wang-Shu: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, China; School of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China
GU Qing: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, China
HUANG Qi-Guo: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, China
LI Na: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, China
Abstract: In real software development, a project that needs defect prediction may be new or may have little training data. A simple solution is to use training data from other projects (i.e., source projects) to construct the model, and to use the trained model to perform prediction on the current project (i.e., the target project). However, datasets from different projects may differ substantially in distribution. To solve this problem, a novel two-phase cross-project defect prediction method, FeCTrA, is proposed, which considers both feature transfer and instance transfer. In the feature transfer phase, FeCTrA uses cluster analysis to select features that have high distribution similarity between the source project and the target project. In the instance transfer phase, FeCTrA utilizes TrAdaBoost, which selects relevant instances from the source project when given some labeled instances in the target project. To verify the effectiveness of FeCTrA, the Relink and AEEEM datasets are chosen as the experimental subjects and F1 as the performance measure. First, FeCTrA is found to outperform single-phase methods, which consider only feature transfer or only instance transfer. Second, compared with state-of-the-art baseline methods (i.e., TCA+, Peters filter, Burak filter, and DCPDP), FeCTrA improves performance by 23%, 7.2%, 9.8%, and 38.2%, respectively, on the Relink dataset and by 96.5%, 108.5%, 103.6%, and 107.9%, respectively, on the AEEEM dataset. Finally, the influence of the factors in FeCTrA is analyzed, and a guideline for using the method effectively is provided.
Keywords: software quality assurance; software defect prediction; cross-project defect prediction; transfer learning; feature transfer; instance transfer
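
The instance transfer phase described in the abstract builds on TrAdaBoost. As a rough illustration, the sketch below implements one reweighting round of TrAdaBoost as it is commonly described: source instances that the weak learner misclassifies lose weight, while misclassified labeled target instances gain weight. It is not taken from the paper, and FeCTrA's exact variant may differ. The function name, the argument layout, and the clamping of the target error are assumptions made for this example.

```python
import math


def tradaboost_reweight(w_src, w_tgt, err_src, err_tgt, n_rounds):
    """One TrAdaBoost-style reweighting round (illustrative sketch).

    w_src, w_tgt:     current weights of source / labeled target instances
    err_src, err_tgt: 0/1 flags, 1 where the weak learner misclassified
    n_rounds:         total number of boosting rounds planned
    Returns the two updated weight lists, normalized to sum to 1 overall.
    """
    n = len(w_src)
    # Fixed factor in (0, 1]: shrinks misclassified source instances.
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / n_rounds))

    # Weighted error of the weak learner, measured on target instances only.
    eps = sum(w * e for w, e in zip(w_tgt, err_tgt)) / sum(w_tgt)
    # Assumption: clamp eps so that beta_tgt stays strictly inside (0, 1).
    eps = min(max(eps, 1e-10), 0.499)
    beta_tgt = eps / (1.0 - eps)

    # Misclassified source instances are down-weighted ...
    new_src = [w * beta_src ** e for w, e in zip(w_src, err_src)]
    # ... and misclassified target instances are up-weighted.
    new_tgt = [w * beta_tgt ** (-e) for w, e in zip(w_tgt, err_tgt)]

    total = sum(new_src) + sum(new_tgt)
    return [w / total for w in new_src], [w / total for w in new_tgt]


if __name__ == "__main__":
    # Two source and four target instances with made-up error flags.
    src, tgt = tradaboost_reweight(
        w_src=[0.25, 0.25],
        w_tgt=[0.125, 0.125, 0.125, 0.125],
        err_src=[1, 0],
        err_tgt=[1, 0, 0, 0],
        n_rounds=10,
    )
    print(src, tgt)
```

Repeating this step over several rounds gradually concentrates weight on source instances that agree with the labeled target data, which matches the abstract's description of selecting source instances whose distribution is close to that of the labeled target instances.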