
Software Defect Prediction Model Based on Deep Learning (基于深度学习的软件缺陷预测模型)

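Cite this article:	CHEN Kai, SHAO Pei-Nan. Software Defect Prediction Model Based on Deep Learning[J]. Computer Systems & Applications (计算机系统应用), 2021, 30(1): 29-37.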

Authors:	CHEN Kai (陈凯), SHAO Pei-Nan (邵培南)

Affiliation:	The 32nd Research Institute of China Electronics Technology Group Corporation (中国电子科技集团第三十二研究所), Shanghai 201808, China (both authors)

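Abstract:	To improve software reliability, software defect prediction has become an important research direction in software engineering. Traditional defect prediction methods mainly design static code metrics and use machine learning classifiers to predict the defect probability of code. However, static code metrics do not fully capture the semantic features hidden in the code. To address this, this paper proposes a software defect prediction model based on a deep convolutional neural network. First, suitable nodes are selected from the abstract syntax tree of the source code to extract representation vectors, and a dictionary is built to map them to integer vectors so that they can be fed to the convolutional neural network. Then, a convolutional neural network is designed based on GoogLeNet, and its capacity for deep data mining is used to extract the syntactic and semantic features from the input. In addition, the model uses random oversampling to handle class imbalance and applies dropout in the network to prevent overfitting. Finally, the model is tested on historical project data from Promise and compared with three other methods using AUC and F1-measure as metrics. The experimental results show that the proposed model achieves some improvement in defect prediction performance.
Keywords:	software defect prediction; abstract syntax tree; convolutional neural network; random oversampling; dropout
Received:	2020/5/19
Revised:	2020/6/16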
Software Defect Prediction Model Based on Deep Learning

CHEN Kai, SHAO Pei-Nan. Software Defect Prediction Model Based on Deep Learning[J]. Computer Systems & Applications, 2021, 30(1): 29-37.

Authors:	CHEN Kai, SHAO Pei-Nan

Affiliation:	The 32nd Research Institute of China Electronics Technology Group Corporation, Shanghai 201808, China

Abstract:	To improve software reliability, software defect prediction has become an important research direction in software engineering. Traditional defect prediction methods mainly design static code metrics and use machine learning classifiers to predict the defect probability of the code. However, static code metrics do not fully capture the semantic features hidden in the code. To address this, this study proposes a software defect prediction model based on a convolutional neural network. First, representation vectors are extracted from suitable nodes in the abstract syntax tree of the source code, and a dictionary is constructed to map them to integer vectors that can be fed to the convolutional neural network. Then, a convolutional neural network is designed based on GoogLeNet, and its capacity for deep data mining is used to extract the syntactic and semantic features from the input. In addition, the model uses random oversampling to deal with class imbalance in the data and applies dropout in the network to prevent overfitting. Finally, historical project data from Promise are used to test the model, and AUC and F1-measure are used as metrics for comparison with three other methods. The results show that the proposed model achieves some improvement in software defect prediction performance.
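
The first step in the abstract (select AST nodes, build a dictionary, map tokens to an integer vector) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses Python's standard `ast` module as a stand-in parser, and the node selection `CONTROL_NODES`, the helper names `extract_tokens`, `build_vocab` and `encode`, and the use of 0 for padding are all hypothetical choices that the abstract does not specify.

```python
import ast

# Hypothetical node selection; the abstract does not list the node types used.
CONTROL_NODES = (ast.If, ast.For, ast.While, ast.Return)


def extract_tokens(source):
    """Return one token per selected AST node, in depth-first order."""
    tokens = []

    def visit(node):
        if isinstance(node, ast.FunctionDef):
            tokens.append(node.name)            # function declarations: keep the name
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            tokens.append(node.func.id)         # plain calls: keep the callee name
        elif isinstance(node, CONTROL_NODES):
            tokens.append(type(node).__name__)  # control flow: keep the node type
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


def build_vocab(token_lists):
    """Map each distinct token to a positive integer; 0 is reserved for padding."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab


def encode(tokens, vocab, length):
    """Turn tokens into a fixed-length integer vector (truncate or pad with 0).

    Tokens missing from the vocabulary are also mapped to 0 in this sketch.
    """
    ids = [vocab.get(tok, 0) for tok in tokens][:length]
    return ids + [0] * (length - len(ids))


example = "def f(x):\n    if x:\n        return g(x)\n    return 0\n"
example_tokens = extract_tokens(example)
example_vocab = build_vocab([example_tokens])
print(example_tokens)                             # ['f', 'If', 'Return', 'g', 'Return']
print(encode(example_tokens, example_vocab, 8))   # [1, 2, 3, 4, 3, 0, 0, 0]
```

Fixed-length integer vectors of this kind are what an embedding layer in front of a convolutional network would typically consume; the vector length (8 here) is an arbitrary value chosen for the example.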

Keywords:	software defect prediction; abstract syntax tree; convolutional neural network; random oversampling; dropout
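
The random oversampling mentioned in the abstract and keywords can be illustrated with a minimal sketch: samples from smaller classes are duplicated at random until every class is as large as the largest one. The function name `random_oversample`, its `seed` parameter, and the decision to balance all classes to the majority size are assumptions made for this example; the abstract does not describe the exact procedure.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until classes are balanced."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Four clean files (label 0) and one defective file (label 1).
features = [[1], [2], [3], [4], [5]]
defect_labels = [0, 0, 0, 0, 1]
balanced_x, balanced_y = random_oversample(features, defect_labels)
print(balanced_y.count(0), balanced_y.count(1))   # 4 4
```

In common practice, oversampling of this kind is applied to the training split only, so that duplicated samples do not leak into the data used to compute AUC and F1-measure.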
This article has been indexed by Wanfang Data (万方数据) and other databases.