
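Software Defect Prediction Model Based on Deep Learning
Cite this article: CHEN Kai, SHAO Pei-Nan. Software Defect Prediction Model Based on Deep Learning[J]. Computer Systems & Applications (计算机系统应用), 2021, 30(1): 29-37.
Authors: CHEN Kai  SHAO Pei-Nan
Affiliation: The 32nd Research Institute of China Electronics Technology Group Corporation, Shanghai 201808, China (both authors)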
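Abstract: Software defect prediction has become an important research direction in software engineering as a way to improve software reliability. Traditional defect prediction methods mainly design static code metrics and use machine learning classifiers to predict the probability that code is defective. However, static code metrics do not fully capture the semantic features hidden in the code. To address this, this paper proposes a software defect prediction model based on a deep convolutional neural network. First, representation vectors are extracted from suitable nodes of the source code's abstract syntax tree, and a dictionary is built to map them to integer vectors that can be fed to the convolutional neural network. Then, a convolutional neural network is designed based on GoogLeNet, using the network's capacity for deep data mining to extract the syntactic and semantic features. In addition, the model uses random oversampling to handle class imbalance in the data and applies dropout in the network to prevent overfitting. Finally, the model is tested on historical project data from Promise and compared with three other methods using AUC and F1-measure as metrics; the experimental results show that the proposed model achieves some improvement in software defect prediction performance.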
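The first step in the abstract (select AST nodes, then map them to integers through a dictionary) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say which node types are selected or which tooling is used, so this example uses Python's standard `ast` module, a hypothetical `SELECTED` set of node types, and zero-padding to a fixed length. It is not the authors' implementation.

```python
import ast

# Hypothetical choice of "suitable" node types; the abstract does not list them.
SELECTED = (ast.FunctionDef, ast.Call, ast.If, ast.For, ast.While, ast.Return)


def node_token(node):
    """Return a string token for one selected AST node."""
    if isinstance(node, ast.FunctionDef):
        return node.name                      # declared function name
    if isinstance(node, ast.Call):
        func = node.func
        if isinstance(func, ast.Name):
            return func.id                    # plain call: g(x)
        if isinstance(func, ast.Attribute):
            return func.attr                  # method call: obj.g(x)
        return "Call"
    return type(node).__name__                # control-flow nodes: "If", "For", ...


def extract_tokens(source):
    """Parse source code and collect tokens from the selected node types."""
    tree = ast.parse(source)
    return [node_token(n) for n in ast.walk(tree) if isinstance(n, SELECTED)]


def build_vocab(token_lists):
    """Assign each distinct token a positive integer; 0 is reserved for padding."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab


def encode(tokens, vocab, length):
    """Map tokens to integers, then truncate or zero-pad to a fixed length."""
    ids = [vocab.get(t, 0) for t in tokens][:length]
    return ids + [0] * (length - len(ids))
```

In this sketch, unknown tokens share the padding index 0 for brevity; a separate out-of-vocabulary index would be a reasonable alternative.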

Keywords: software defect prediction  abstract syntax tree  convolutional neural network  random oversampling  dropout
Received: 2020/5/19
Revised: 2020/6/16

Software Defect Prediction Model Based on Deep Learning
CHEN Kai, SHAO Pei-Nan. Software Defect Prediction Model Based on Deep Learning[J]. Computer Systems & Applications, 2021, 30(1): 29-37.
Authors:CHEN Kai  SHAO Pei-Nan
Affiliation:The 32nd Research Institute of China Electronics Technology Group Corporation, Shanghai 201808, China
Abstract: To improve software reliability, software defect prediction has become an important research direction in software engineering. Traditional software defect prediction methods mainly design static code metrics and use machine learning classifiers to predict the defect probability of the code. However, static code metrics do not fully capture the semantic features hidden in the code. To address this, this study proposes a software defect prediction model based on a deep convolutional neural network. First, representation vectors are extracted from suitable nodes in the abstract syntax tree of the source code, and a dictionary is constructed to map them to integer vectors that can be fed to the convolutional neural network. Then, a convolutional neural network is designed based on GoogLeNet, and its capacity for deep data mining is used to extract the syntactic and semantic features. In addition, the model uses random oversampling to deal with class imbalance in the data and applies dropout in the network to prevent overfitting. Finally, historical project data from Promise are used to test the model, with AUC and F1-measure as the metrics for comparison against three other methods. The results show that the proposed model achieves some improvement in software defect prediction performance.
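Two of the ingredients named in the abstract, random oversampling and the AUC and F1-measure metrics, are simple enough to sketch in plain Python. The function names, the binary 0/1 label convention (1 = defective), and the fixed seed below are assumptions for illustration, not details taken from the paper; the AUC here uses the standard pairwise-ranking definition rather than any particular library routine.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until all classes are the same size."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw with replacement from the existing samples of this class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def f1_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Oversampling is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data; the abstract does not state how the authors split their data.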
Keywords:software defect prediction  abstract syntax tree  convolutional neural network  random oversampling  dropout
This article is indexed in Wanfang Data (万方数据) and other databases.