A Feature Filtering and Instance Transfer Framework for Cross-Project Software Defect Prediction
Cite this article: DIAO Xu-yang, LIU Xiao-yang, XU Li, CHEN Tian-qun, XU Ya-zhou. A Feature Filtering and Instance Transfer Framework for Cross-Project Software Defect Prediction[J]. Computer and Modernization, 2021, 0(12): 116-122.
Authors: DIAO Xu-yang  LIU Xiao-yang  XU Li  CHEN Tian-qun  XU Ya-zhou
Affiliations: Shanghai Electro-Mechanical Engineering Institute, Shanghai 201109, China; Shanghai Aerospace Electronic Technology Institute, Shanghai 201109, China
Abstract: In cross-project software defect prediction, the feature correlation and the difference in instance distribution between the source project and the target project are the main factors that affect the performance of the prediction model. Addressing both feature filtering and instance transfer, we propose a cross-project defect prediction framework called KCF-KMM (K-medoids Cluster Filtering-Kernel Mean Matching). In the feature filtering phase, the K-medoids clustering algorithm is used to select a feature subset, filtering out features that have low relevance to the target project. In the instance transfer phase, the KMM algorithm measures the distribution difference between source-project and target-project instances and uses it to assign an influence weight to each training instance. Finally, a small amount of labeled data from the target project is combined with the weighted source data to build a hybrid defect prediction model. To verify its effectiveness, KCF-KMM is compared with the classic cross-project defect prediction methods TCA+, TNB, and NNFilter in terms of accuracy and F1 score. On the Apache dataset, KCF-KMM improves accuracy by 34.1%, 0.8%, and 21.1% and F1 score by 14.4%, 3.7%, and 10.6% over TCA+, TNB, and NNFilter, respectively.

Keywords: source project  target project  feature correlation  distribution difference  feature filtering  instance transfer
Received: 2021-12-24
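
The instance transfer step in the abstract relies on Kernel Mean Matching (KMM), which reweights source instances so that their kernel mean approaches that of the target instances. The abstract does not give the authors' formulation, so the following is only a minimal sketch of the generic KMM objective, under several assumptions: a Gaussian (RBF) kernel, a simple projected gradient solver in place of a quadratic programming solver, and no constraint on the sum of the weights. The names `rbf`, `kmm_weights`, `gamma`, `bound`, and `steps` are hypothetical and chosen for illustration.

```python
import math


def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def kmm_weights(source, target, gamma=1.0, bound=10.0, steps=500):
    """Approximate KMM weights by projected gradient descent.

    Minimises 0.5 * b^T K b - kappa^T b subject to 0 <= b_i <= bound.
    Simplification (assumption): the usual constraint that keeps sum(b)
    close to len(source) is omitted to keep the sketch short.
    """
    n_s, n_t = len(source), len(target)
    # K[i][j]: kernel between source instances i and j.
    K = [[rbf(xi, xj, gamma) for xj in source] for xi in source]
    # kappa[i]: scaled kernel similarity of source instance i to all targets.
    kappa = [n_s / n_t * sum(rbf(xi, xt, gamma) for xt in target)
             for xi in source]
    # K is positive semidefinite, so trace(K) bounds its largest eigenvalue
    # and 1 / trace(K) is a safe step size.
    lr = 1.0 / sum(K[i][i] for i in range(n_s))
    beta = [1.0] * n_s
    for _ in range(steps):
        grad = [sum(K[i][j] * beta[j] for j in range(n_s)) - kappa[i]
                for i in range(n_s)]
        # Gradient step followed by projection onto the box [0, bound].
        beta = [min(bound, max(0.0, b - lr * g)) for b, g in zip(beta, grad)]
    return beta


if __name__ == "__main__":
    # Toy one-dimensional data: the third source instance lies far from the targets.
    source = [[0.0], [0.1], [5.0]]
    target = [[0.0], [0.05], [0.1]]
    print(kmm_weights(source, target))
```

In this toy run the source instance far from the target data should receive a weight near zero, while the two nearby instances receive larger weights. Such weights could then serve as per-instance training weights for a classifier, which is one plausible reading of the "influence weight" described in the abstract.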
This article is indexed in Wanfang Data and other databases.