
Cross-project Defect Prediction Method Using Adversarial Learning

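Cite this article:	XING Ying, QIAN Xiao-Meng, GUAN Yu, ZHANG Shi-Hao, ZHAO Meng-Ci, LIN Wan-Ting. Cross-project Defect Prediction Method Using Adversarial Learning[J]. Journal of Software (软件学报), 2022, 33(6): 2097-2112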

Authors:	XING Ying, QIAN Xiao-Meng, GUAN Yu, ZHANG Shi-Hao, ZHAO Meng-Ci, LIN Wan-Ting

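Affiliation:	School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; School of Modern Post (School of Automation), Beijing University of Posts and Telecommunications, Beijing 100876, China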

Funding:	National Natural Science Foundation of China (61702044); National Key Research and Development Program of China (2017YFD0401001)

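Abstract:	Cross-project defect prediction (CPDP) has become an important research direction in software engineering data mining. It uses defective code from other projects to build prediction models, which addresses the shortage of data during model construction. However, the code files of source and target projects differ in data distribution, which degrades cross-project prediction. Drawing on the adversarial learning idea of the generative adversarial network (GAN), the method uses a discriminator to shift the distribution of target project features toward that of source project features, thereby improving cross-project defect prediction performance. Specifically, the proposed abstract continuous generative adversarial network (AC-GAN) method comprises two stages, data processing and model construction: (1) the source and target project code is first converted into abstract syntax trees (AST); the trees are then traversed depth-first to obtain node sequences; a continuous bag-of-words model (CBOW) is used to generate word vectors, and the node sequences are converted into numeric vectors according to the word vector table; (...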
Keywords:	cross-project defect prediction; generative adversarial network; continuous bag-of-words model; abstract syntax tree
Received:	2021-09-05
Revised:	2021-10-15
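
The data-processing stage described in the abstract (code to AST, depth-first traversal to a node sequence, then a numeric vector) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which language the analysed projects are written in, so Python's standard `ast` module stands in for the parser; the function names `dfs_tokens`, `build_vocab`, and `encode` are hypothetical; and a plain integer vocabulary index replaces the CBOW word vectors that the paper uses.

```python
import ast


def dfs_tokens(source):
    """Return AST node type names in depth-first (pre-order) order.

    Assumption: Python source parsed with the stdlib `ast` module.
    """
    tokens = []

    def visit(node):
        tokens.append(type(node).__name__)      # emit the node before its children
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


def build_vocab(sequences):
    """Map each distinct token to an integer id; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(seq, vocab, length):
    """Turn a token sequence into a fixed-length list of ids.

    Stand-in for the CBOW lookup: unknown tokens and padding both map to 0.
    """
    ids = [vocab.get(tok, 0) for tok in seq][:length]
    return ids + [0] * (length - len(ids))


if __name__ == "__main__":
    toks = dfs_tokens("def add(a, b):\n    return a + b\n")
    vocab = build_vocab([toks])
    print(toks)
    print(encode(toks, vocab, 16))
```

In a fuller pipeline, the integer ids would index rows of an embedding table trained with CBOW, so that each file becomes a sequence of dense vectors instead of ids.
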
Cross-project Defect Prediction Method Using Adversarial Learning

XING Ying, QIAN Xiao-Meng, GUAN Yu, ZHANG Shi-Hao, ZHAO Meng-Ci, LIN Wan-Ting. Cross-project Defect Prediction Method Using Adversarial Learning[J]. Journal of Software, 2022, 33(6): 2097-2112

Authors:	XING Ying, QIAN Xiao-Meng, GUAN Yu, ZHANG Shi-Hao, ZHAO Meng-Ci, LIN Wan-Ting

Affiliation:	School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; School of Modern Post (School of Automation), Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract:	Cross-project defect prediction (CPDP) has become an important research direction in software engineering data mining. It uses defective code from other projects to build prediction models, which addresses the problem of insufficient data during model construction. However, the code files of source and target projects differ in data distribution, which leads to poor cross-project prediction results. Based on the adversarial learning idea of the generative adversarial network (GAN), this paper uses a discriminator to change the distribution of target project features so that it approaches the distribution of source project features, thereby improving cross-project defect prediction performance. Specifically, the proposed Abstract Continuous Generative Adversarial Network (AC-GAN) method consists of two stages: data processing and model construction. (1) First, the source and target project code is converted into abstract syntax trees (AST), which are then traversed depth-first to derive token sequences. The continuous bag-of-words model (CBOW) is used to generate word vectors, and the token sequences are transformed into numeric vectors based on the word vector table. (2) The processed numeric vectors are fed into a GAN-based model for feature extraction and data migration. Finally, a binary classifier determines whether the target project code files are defective. Comparison experiments on 15 source-target project pairs demonstrate the effectiveness of the AC-GAN method.

Keywords:	Cross-project defect prediction; Generative adversarial network; Bag-of-words model; Abstract syntax tree
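
The abstract does not state the loss functions used in the model-construction stage. As a hedged illustration only, one common way to write a discriminator-driven alignment of target features to source features (the formulation typical of adversarial domain adaptation, not necessarily the one used in AC-GAN) is shown below. All symbols are assumptions introduced here: $F_s$ and $F_t$ are source and target feature extractors, $D$ is the discriminator that outputs the probability that a feature comes from the source project, and $X_s$, $X_t$ are the source and target numeric vectors.

```latex
% Discriminator: label source features as 1, target features as 0
\min_{D}\; \mathcal{L}_{D}
  = -\,\mathbb{E}_{x_s \sim X_s}\!\left[\log D\big(F_s(x_s)\big)\right]
    -\,\mathbb{E}_{x_t \sim X_t}\!\left[\log\!\big(1 - D(F_t(x_t))\big)\right]

% Target feature extractor: make target features look like source features to D
\min_{F_t}\; \mathcal{L}_{F_t}
  = -\,\mathbb{E}_{x_t \sim X_t}\!\left[\log D\big(F_t(x_t)\big)\right]
```

Under this assumed setup the two objectives are optimized alternately. Once $D$ can no longer separate the two feature sets, a binary classifier trained on labelled source features can be applied to the aligned target features.
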
This article is indexed in Wanfang Data and other databases.