1.

Integrating in-process software defect prediction with association mining to discover defect pattern

Ching-Pao Chang Chih-Ping Chu Yu-Fang Yeh 《Information and Software Technology》2009,51(2):375-384

Rather than detecting defects at an early stage to reduce their impact, defect prevention means preventing defects from occurring in the first place. Causal analysis is a common approach to discover the causes of defects and take corrective actions. However, selecting defects to analyze among large amounts of reported defects is time consuming and requires significant effort. To address this problem, this study proposes a defect prediction approach in which the reported defects and performed actions are utilized to discover the patterns of actions that are likely to cause defects. The approach proposed in this study is adapted from Action-Based Defect Prediction (ABDP), an approach that uses classification with a decision tree technique to build a prediction model, and performs association rule mining on the records of actions and defects. An action is defined as a basic operation used to perform a software project, while a defect is defined as a software flaw that can arise at any stage of the software process. The association rule mining finds the maximum rule set with specific minimum support and confidence, and thus the discovered knowledge can be utilized to interpret the prediction models and software process behaviors. The discovered patterns can then be applied to predict the defects generated by subsequent actions and to take the necessary corrective actions to avoid defects. The proposed defect prediction approach applies association rule mining to discover defect patterns, and multi-interval discretization to handle the continuous attributes of actions. The proposed approach is applied to a business project, giving excellent prediction results and revealing the efficiency of the proposed approach. The main benefit of using this approach is that the discovered defect patterns can be used to evaluate subsequent actions for in-process projects, and reduce variance of the reported data resulting from different projects. Additionally, the discovered patterns can be used in causal analysis to identify the causes of defects for software process improvement.
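
The support and confidence thresholds mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the ABDP implementation: the record layout, the attribute strings and the threshold values below are hypothetical, chosen only to show how a candidate rule "actions -> defect" would be scored and filtered.

```python
def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    support    = fraction of all records containing antecedent and consequent
    confidence = fraction of records containing the antecedent that also
                 contain the consequent
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(records)
    n_antecedent = sum(1 for r in records if antecedent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


def keep_rule(support, confidence, min_support, min_confidence):
    """A rule is kept only if it meets both minimum thresholds."""
    return support >= min_support and confidence >= min_confidence


# Hypothetical action records: a set of action attributes plus a defect label.
records = [
    {"task=coding", "effort=high", "defect=yes"},
    {"task=coding", "effort=high", "defect=yes"},
    {"task=coding", "effort=low", "defect=no"},
    {"task=review", "effort=high", "defect=no"},
]

support, confidence = rule_stats(
    records, {"task=coding", "effort=high"}, {"defect=yes"}
)
# Assumed thresholds, for illustration only.
kept = keep_rule(support, confidence, min_support=0.3, min_confidence=0.8)
print(support, confidence, kept)
```

In this toy data the rule covers two of the four records and holds every time its antecedent occurs, so it passes both assumed thresholds.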

2.

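Software defect prediction with an improved PSO-ISVM algorithm

Zhang Fei (张飞) 《计算机工程与应用》 (Computer Engineering and Applications) 2016,52(11):17-21

This paper proposes a defect prediction method for measurement and control software based on an improved particle swarm optimization support vector machine (PSO-ISVM). A cost penalty coefficient is introduced to define the fitness function of the particle swarm optimization algorithm, and minimizing the fitness value is taken as the optimization objective. This filters out a large amount of redundant, interfering information, improves prediction accuracy for defective modules of measurement and control software, and finds the optimal parameters of the support vector machine. A simulation case study analyzes the effectiveness of the method on measurement and control software, and a comparison with commonly used defect prediction methods shows that the model speeds up software defect prediction and improves prediction accuracy for defective modules.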

3.

Predicting weekly defect inflow in large software projects based on project planning and test status

Miroslaw Staron Wilhelm Meding 《Information and Software Technology》2008,50(7-8):782-796

Defects discovered during the testing phase in software projects need to be removed before the software is shipped to the customers. The removal of defects can constitute a significant amount of effort in a project, and project managers are faced with a decision whether to continue development or shift some resources to cope with defect removal. The goal of this research is to improve the practice of project management by providing a method for predicting the number of defects reported into the defect database in the project. In this paper we present a method for predicting the number of defects reported into the defect database in a large software project on a weekly basis. The method is based on using project progress data, in particular the information about the test progress, to predict defect inflow in the coming three weeks. The results show that the prediction accuracy of our models is up to 72% (mean magnitude of relative error for predictions of 1 week in advance is 28%) when used in ongoing large software projects. The method is intended to support project managers in adjusting resources in the project more accurately, since they are notified in advance about the potentially large effort needed to correct defects.

4.

A critique of software defect prediction models (total citations: 4; self-citations: 0; citations by others: 4)

Fenton N.E. Neil M. 《IEEE Transactions on Software Engineering》1999,25(5):675-689

Many organizations want to predict the number of defects (faults) in software systems, before they are deployed, to gauge the likely delivered quality and maintenance effort. To help in this, numerous software metrics and statistical models have been developed, with a correspondingly large literature. We provide a critical review of this literature and the state-of-the-art. Most of the wide range of prediction models use size and complexity metrics to predict defects. Others are based on testing data, the “quality” of the development process, or take a multivariate approach. The authors of the models have often made heroic contributions to a subject otherwise bereft of empirical studies. However, there are a number of serious theoretical and practical problems in many studies. The models are weak because of their inability to cope with the, as yet, unknown relationship between defects and failures. There are fundamental statistical and data quality problems that undermine model validity. More significantly, many prediction models tend to model only part of the underlying problem and seriously misspecify it. To illustrate these points the Goldilock's Conjecture, that there is an optimum module size, is used to show the considerable problems inherent in current defect prediction approaches. Careful and considered analysis of past and new results shows that the conjecture lacks support and that some models are misleading. We recommend holistic models for software defect prediction, using Bayesian belief networks, as alternative approaches to the single-issue models used at present. We also argue for research into a theory of “software decomposition” in order to test hypotheses about defect introduction and help construct a better science of software engineering.

5.

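Research on a software defect prediction model based on CS-ANN

Wang Hailin (王海林) Yu Qian (于倩) Li Tong (李彤) Yu Yong (郁湧) Ming Li (明利) Sun Jinwen (孙金文) 《计算机应用研究》 (Application Research of Computers) 2017,34(2)

To improve the accuracy of software defect prediction, a CS-ANN based prediction method is proposed that combines the optimization capability of the cuckoo search (CS) algorithm with the nonlinear computing capability of artificial neural networks (ANN). The method first applies an association-rule-based feature selection algorithm to reduce the dimensionality of the data and remove noisy attributes. It then uses cuckoo search to find the weights of the neural network and builds the prediction model from those weights and the neural network algorithm. Finally, the model is used to predict defects. Simulation experiments on public NASA datasets show that the model lowers the false alarm rate and raises prediction accuracy, and that its AUC (area under the ROC curve), F1 and G-mean scores are all better than those of existing models.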
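
Two of the evaluation measures named in this abstract, F1 and G-mean, follow directly from a binary confusion matrix. The Python sketch below uses the standard definitions (F1 as the harmonic mean of precision and recall, G-mean as the geometric mean of recall and specificity). The function name and the counts are hypothetical, and it is an assumption that the paper uses exactly these variants.

```python
import math


def defect_metrics(tp, fp, tn, fn):
    """Compute common defect-prediction measures from confusion-matrix counts.

    tp: defective modules predicted defective
    fp: clean modules predicted defective (false alarms)
    tn: clean modules predicted clean
    fn: defective modules predicted clean
    """
    recall = tp / (tp + fn) if tp + fn else 0.0          # detection rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    false_alarm = fp / (fp + tn) if fp + tn else 0.0     # false alarm rate
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    g_mean = math.sqrt(recall * (1.0 - false_alarm))     # sqrt(recall * specificity)
    return {
        "recall": recall,
        "precision": precision,
        "false_alarm": false_alarm,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical counts for 100 modules.
metrics = defect_metrics(tp=30, fp=10, tn=50, fn=10)
print(metrics)
```

G-mean is useful on imbalanced defect data because it drops toward zero when either the defective or the clean class is predicted poorly, which plain accuracy does not.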

6.

Predicting the Flow of Defect Correction Effort using a Bayesian Network Model

Thomas Schulz Łukasz Radliński Thomas Gorges Wolfgang Rosenstiel 《Empirical Software Engineering》2013,18(3):435-477

The number of defects alone does not provide software companies with enough information on the effort required to fix them. Defects have different impacts on the overall defect correction effort – defects introduced in one phase may be found and corrected in the same or a later phase. The later they are found, the more effort is required to correct them. The main aim of this paper is to build and validate a model (Bayesian Network) for predicting the defect correction effort at various phases of the software development process. The procedure of building the model covers the following steps: problem analysis, data analysis, model definition and enhancement, simulation runs, and model validation. The developed Defect Cost Flow Model (DCFM), which is an implementation of the V-model of a software project lifecycle, correctly incorporates known qualitative and quantitative relationships. Application of DCFM in a real industrial process revealed its high potential in finding the appropriate amount of review effort for specific development phases to minimize the overall costs. The model may be used in industry for decision support. It can be extended and calibrated to meet the needs of a specific development environment.

7.

A method for forecasting defect backlog in large streamline software development projects and its industrial evaluation

Miroslaw Staron Wilhelm Meding Bo Söderqvist 《Information and Software Technology》2010,52(10):1069-1079

Context: Predicting the number of defects to be resolved in large software projects (defect backlog) usually requires complex statistical methods and thus is hard to use on a daily basis by practitioners in industry. Making predictions in a simpler and more robust way is often required by practitioners in the software engineering industry. Objective: The objective of this paper is to present a simple and reliable method for forecasting the level of defect backlog in large, lean-based software development projects. Method: The new method was created as part of an action research project conducted at Ericsson. In order to create the method we have evaluated multivariate linear regression, expert estimations and analogy-based predictions w.r.t. their accuracy and ease-of-use in industry. We have also evaluated the new method in a live project at one of the units of Ericsson during a period of 21 weeks (from the beginning of the project until the release of the product). Results: The method for forecasting the level of defect backlog uses an indicator of the trend (an arrow) as a basis to forecast the level of defect backlog. Forecasts are based on a moving average which, combined with the current level of defect backlog, was found to be the best prediction method (Mean Magnitude of Relative Error of 16%) for the level of future defect backlog. Conclusion: We have found that ease-of-use and accuracy are the main aspects for practitioners who use predictions in their work. In this paper it is concluded that using the simple moving average provides sufficiently good accuracy (much appreciated by practitioners involved in the study). We also conclude that using the indicator (forecasting the trend) instead of the absolute number of defects in the backlog increases the confidence in our method compared to our previous attempts (regression, analogy-based, and expert estimates).
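
A moving-average forecast and the Mean Magnitude of Relative Error (MMRE) used to judge it can be sketched in a few lines of Python. The abstract does not give the exact formula that combines the moving average with the current backlog level, so the sketch below assumes a plain simple moving average over a fixed window. The window size, the backlog values and the `trend_arrow` helper are hypothetical.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("need at least `window` observations")
    return sum(history[-window:]) / window


def mmre(actual, predicted):
    """Mean Magnitude of Relative Error: mean of |actual - predicted| / actual."""
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def trend_arrow(forecast, current):
    """Hypothetical trend indicator: direction of the forecast versus today."""
    if forecast > current:
        return "up"
    if forecast < current:
        return "down"
    return "flat"


# Hypothetical weekly defect backlog levels.
backlog = [100, 110, 120, 130, 125, 135]
window = 3

# Back-test: forecast each week from the weeks before it.
predicted = [
    moving_average_forecast(backlog[:i], window)
    for i in range(window, len(backlog))
]
actual = backlog[window:]
error = mmre(actual, predicted)

# Forecast for the coming week and the corresponding trend indicator.
next_forecast = moving_average_forecast(backlog, window)
trend = trend_arrow(next_forecast, backlog[-1])
print(predicted, round(error, 4), next_forecast, trend)
```

An MMRE of 16%, as reported in the abstract, would mean that forecasts deviate from the observed backlog by 16% of the observed value on average.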

8.

Evaluating defect prediction approaches: a benchmark and an extensive comparison

Marco D’Ambros Michele Lanza Romain Robbes 《Empirical Software Engineering》2012,17(4-5):531-577

Reliably predicting software defects is one of the holy grails of software engineering. Researchers have devised and implemented a plethora of defect/bug prediction approaches varying in terms of accuracy, complexity and the input data they require. However, the absence of an established benchmark makes it hard, if not impossible, to compare approaches. We present a benchmark for defect prediction, in the form of a publicly available dataset consisting of several software systems, and provide an extensive comparison of well-known bug prediction approaches, together with novel approaches we devised. We evaluate the performance of the approaches using different performance indicators: classification of entities as defect-prone or not, ranking of the entities, with and without taking into account the effort to review an entity. We performed three sets of experiments aimed at (1) comparing the approaches across different systems, (2) testing whether the differences in performance are statistically significant, and (3) investigating the stability of approaches across different learners. Our results indicate that, while some approaches perform better than others in a statistically significant manner, external validity in defect prediction is still an open problem, as generalizing results to different contexts/learners proved to be a partially unsuccessful endeavor.

9.

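A preliminary study of neural-network software defect prediction based on Weka

Feng Bibo (冯必波) 《工业控制计算机》 (Industrial Control Computer) 2012,25(2):67-68

A software defect is a deviation from the expected attributes of a software product and is one of the key factors affecting software quality. Finding and removing software defects is one of the important tasks in the software life cycle. Every software organization knows that it must handle the defects in its software properly, since this is fundamental to the quality on which the organization's survival and development depend. For the feed-forward neural network method commonly used in software defect prediction, this paper applies the systematic parameter-setting methods of the Weka data mining toolkit to the design of the number of nodes and of the network structure, continually improving the confidence of neural-network defect predictions and thus making the predicted results more reasonable.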

10.

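A method for predicting the number of software defects based on data oversampling and ensemble learning

Jian Yiheng (简艺恒) Yu Xiao (余啸) 《计算机应用》 (Journal of Computer Applications) 2018,38(9):2637-2643

Predicting the number of software defects helps testers focus on the modules with the most defects, so that limited testing resources can be allocated sensibly. To address the imbalance of software defect datasets, a method for predicting the number of defects based on data oversampling and ensemble learning, SMOTENDEL, is proposed. First, the original defect dataset is oversampled n times to obtain n balanced datasets. Then a regression algorithm is trained on each of the n balanced datasets to produce n individual models for predicting the number of defects. Finally, the n individual models are combined into one ensemble model, which is used to predict the number of defects in new software modules. Experimental results show that SMOTENDEL substantially outperforms the original prediction methods: with decision tree regression (DTR), Bayesian ridge regression (BRR) and linear regression (LR) as the individual models, the improvements are 7.68%, 3.31% and 3.38%, respectively.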

11.

A study of subgroup discovery approaches for defect prediction

《Information and Software Technology》2013,55(10):1810-1822


12.

Time variance and defect prediction in software projects

Jayalath Ekanayake Jonas Tappolet Harald C. Gall Abraham Bernstein 《Empirical Software Engineering》2012,17(4-5):348-389

It is crucial for a software manager to know whether or not one can rely on a bug prediction model. A wrong prediction of the number or the location of future bugs can lead to problems in the achievement of a project’s goals. In this paper we first verify the existence of variability in a bug prediction model’s accuracy over time both visually and statistically. Furthermore, we explore the reasons for such a high variability over time, which includes periods of stability and variability of prediction quality, and formulate a decision procedure for evaluating prediction models before applying them. To exemplify our findings we use data from four open source projects and empirically identify various project features that influence the defect prediction quality. Specifically, we observed that a change in the number of authors editing a file and the number of defects fixed by them influence the prediction quality. Finally, we introduce an approach to estimate the accuracy of prediction models that helps a project manager decide when to rely on a prediction model. Our findings suggest that one should be aware of the periods of stability and variability of prediction quality and should use approaches such as ours to assess their models’ accuracy in advance.

13.

An empirical study on software defect prediction with a simplified metric set

《Information and Software Technology》2015

Context: Software defect prediction plays a crucial role in estimating the most defect-prone components of software, and a large number of studies have pursued improving prediction accuracy within a project or across projects. However, the rules for making an appropriate decision between within- and cross-project defect prediction when available historical data are insufficient remain unclear. Objective: The objective of this work is to validate the feasibility of the predictor built with a simplified metric set for software defect prediction in different scenarios, and to investigate practical guidelines for the choice of training data, classifier and metric subset of a given project. Method: First, based on six typical classifiers, three types of predictors based on the size of the software metric set were constructed in three scenarios. Then, we validated the acceptable performance of the predictor based on Top-k metrics in terms of statistical methods. Finally, we attempted to minimize the Top-k metric subset by removing redundant metrics, and we tested the stability of such a minimum metric subset with one-way ANOVA tests. Results: The study has been conducted on 34 releases of 10 open-source projects available at the PROMISE repository. The findings indicate that the predictors built with either Top-k metrics or the minimum metric subset can provide an acceptable result compared with benchmark predictors. The guideline for choosing a suitable simplified metric set in different scenarios is presented in Table 12. Conclusion: The experimental results indicate that (1) the choice of training data for defect prediction should depend on the specific requirement of accuracy; (2) the predictor built with a simplified metric set works well and is very useful when only limited resources are available; (3) simple classifiers (e.g., Naïve Bayes) also tend to perform well when using a simplified metric set for defect prediction; and (4) in several cases, the minimum metric subset can be identified to facilitate the procedure of general defect prediction with acceptable loss of prediction precision in practice.

14.

Classification with reject option for software defect prediction

《Applied Soft Computing》2016

Context: Software defect prediction (SDP) is an important task in software engineering. Along with estimating the number of defects remaining in software systems and discovering defect associations, classifying the defect-proneness of software modules plays an important role in software defect prediction. Several machine-learning methods have been applied to handle the defect-proneness of software modules as a classification problem. This type of “yes” or “no” decision is an important drawback in the decision-making process and, if not precise, may lead to misclassifications. To the best of our knowledge, existing approaches rely on fully automated module classification and do not provide a way to incorporate extra knowledge during the classification process. This knowledge can be helpful in avoiding misclassifications in cases where system modules cannot be classified in a reliable way. Objective: We seek to develop an SDP method that (i) incorporates a reject option in the classifier to improve the reliability of the decision-making process; and (ii) makes it possible to postpone the final decision related to rejected modules for an expert analysis or even for another classifier using extra domain knowledge. Method: We develop an SDP method called rejoELM and its variant, IrejoELM. Both methods were built upon the weighted extreme learning machine (ELM) with a reject option that makes it possible to postpone the final decision on non-classified modules, the rejected ones, to another moment. While rejoELM aims to maximize the accuracy for a given rejection rate, IrejoELM maximizes the F-measure. Hence, IrejoELM becomes an alternative for classification with reject option for imbalanced datasets. Results: rejoELM and IrejoELM are tested on five datasets of source code metrics extracted from real-world open-source software projects. Results indicate that rejoELM has an accuracy for several rejection rates that is comparable to some state-of-the-art classifiers with reject option. Although IrejoELM shows lower accuracies for several rejection rates, it clearly outperforms all other methods when the F-measure is used as a performance metric. Conclusion: It is concluded that rejoELM is a valid alternative for classification with reject option problems when classes are nearly equally represented. On the other hand, IrejoELM is shown to be the best alternative for classification with reject option on imbalanced datasets. Since SDP problems are usually characterized as imbalanced learning problems, the use of IrejoELM is recommended.
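
The idea of a reject option can be shown independently of the ELM models in this abstract. The Python sketch below is not rejoELM or IrejoELM: it assumes some classifier has already produced a probability that each module is defective, and it withholds a decision whenever that probability is too close to 0.5. The `reject_band` parameter, the probabilities and the labels are all hypothetical.

```python
def classify_with_reject(p_defective, reject_band=0.2):
    """Return 'defective', 'clean', or 'reject'.

    A module is rejected (left for an expert or another classifier) when the
    predicted probability lies within `reject_band` of the 0.5 decision boundary.
    """
    if abs(p_defective - 0.5) < reject_band:
        return "reject"
    return "defective" if p_defective > 0.5 else "clean"


def accuracy_and_rejection_rate(probabilities, labels, reject_band=0.2):
    """Accuracy over accepted modules only, plus the fraction rejected."""
    decisions = [classify_with_reject(p, reject_band) for p in probabilities]
    accepted = [(d, y) for d, y in zip(decisions, labels) if d != "reject"]
    rejection_rate = 1.0 - len(accepted) / len(decisions)
    accuracy = (
        sum(d == y for d, y in accepted) / len(accepted) if accepted else 0.0
    )
    return accuracy, rejection_rate


# Hypothetical classifier outputs and true labels for six modules.
probabilities = [0.95, 0.10, 0.55, 0.40, 0.80, 0.25]
labels = ["defective", "clean", "clean", "defective", "defective", "defective"]

acc_reject, rate_reject = accuracy_and_rejection_rate(probabilities, labels, 0.2)
acc_plain, rate_plain = accuracy_and_rejection_rate(probabilities, labels, 0.0)
print(acc_reject, rate_reject, acc_plain, rate_plain)
```

In this toy example, rejecting the two borderline modules raises accuracy on the accepted ones compared with forcing a decision on every module, which is the trade-off between accuracy and rejection rate that the abstract evaluates.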

15.

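Software defect prediction combining undersampling and ensemble learning

Li Yong (李勇) 《计算机应用》 (Journal of Computer Applications) 2014,34(8):2291-2294

Software defect prediction is an important way to improve testing efficiency and ensure software reliability. To improve the accuracy of software defect prediction, a prediction model is proposed that combines undersampling with an ensemble of decision tree classifiers. Considering the class imbalance of software defect data, the sampling degree is first determined from the imbalance ratio of the data, and undersampling is performed to rebalance the data. Next, several decision tree sub-classifiers are trained following the Bagging random sampling principle. Finally, the prediction model is formed by majority voting. Simulation experiments were carried out on public NASA software defect prediction datasets. The results show that, compared with three baseline methods, the proposed model reduces the false alarm rate (PF) by more than 10% while maintaining the detection rate, and all composite evaluation measures improve significantly. The model has a low false alarm rate together with high prediction accuracy and stability.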
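
The three steps in this abstract (undersample, train bagged sub-classifiers, combine by majority vote) can be sketched in plain Python. Several simplifications are assumptions rather than the paper's method: the data are balanced 1:1 instead of using a sampling degree derived from the imbalance ratio, each module has a single hypothetical metric, a one-threshold "stump" stands in for a decision tree, and the bootstrap is drawn within each class so that every sample contains both classes.

```python
import random


def undersample(majority, minority, rng):
    """Randomly keep as many majority-class samples as there are minority ones."""
    return rng.sample(majority, len(minority)), list(minority)


def train_stump(clean, defective):
    """Stand-in base learner: a threshold halfway between the two class means.

    Assumes that larger metric values indicate defect-prone modules.
    """
    mean_clean = sum(clean) / len(clean)
    mean_defective = sum(defective) / len(defective)
    return (mean_clean + mean_defective) / 2


def train_ensemble(clean, defective, n_models=5, seed=0):
    """Undersample once, then train n_models stumps on bootstrap samples."""
    rng = random.Random(seed)
    balanced_clean, balanced_defective = undersample(clean, defective, rng)
    thresholds = []
    for _ in range(n_models):
        # Bootstrap: sample with replacement within each class.
        boot_clean = rng.choices(balanced_clean, k=len(balanced_clean))
        boot_defective = rng.choices(balanced_defective, k=len(balanced_defective))
        thresholds.append(train_stump(boot_clean, boot_defective))
    return thresholds


def predict(thresholds, value):
    """Majority vote: defective if more than half of the stumps say so."""
    votes = sum(value > t for t in thresholds)
    return votes > len(thresholds) / 2


# Hypothetical metric values (e.g. a complexity score) per module.
clean_modules = [5, 8, 10, 12, 15, 18, 20, 7, 9, 11]   # majority class
defective_modules = [50, 60, 70]                        # minority class

model = train_ensemble(clean_modules, defective_modules)
print(predict(model, 65), predict(model, 10))
```

An odd number of sub-classifiers avoids ties in the vote. Replacing `train_stump` with a real decision tree learner would bring the sketch closer to the model the abstract describes.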

16.

A fuzzy logic based approach for phase-wise software defects prediction using software metrics

《Information and Software Technology》2015

Context: Software defect prediction during software development has recently attracted the attention of many researchers. Predicting the software defect density indicator in each phase of the software development life cycle (SDLC) is desirable for developing a reliable software product. Software defect prediction at the end of the testing phase may be of limited benefit, because the changes that need to be made in the previous phases of the SDLC may require a huge amount of money and effort in order to achieve the target software quality. Therefore, a phase-wise software defect density indicator prediction model is of great importance. Objective: In this paper, a fuzzy logic based phase-wise software defect prediction model is proposed using the top reliability-relevant metrics of each phase of the SDLC. Method: In the proposed model, the defect density indicator in the requirement analysis, design, coding and testing phases is predicted using nine software metrics of these four phases. The defect density indicator predicted at the end of each phase is also taken as an input to the next phase. Software metrics are assessed in linguistic terms and a fuzzy inference system is employed to develop the model. Results: The predictive accuracy of the proposed model is validated using data from twenty real software projects. Validation results are satisfactory. Measures based on the mean magnitude of relative error and balanced mean magnitude of relative error decrease significantly as the software project size increases. Conclusion: In this paper, a fuzzy logic based model is proposed for predicting the software defect density indicator at each phase of the SDLC. The predicted defects of twenty different software projects are found to be very close to the actual defects detected during testing. The predicted defect density indicators are very helpful for analyzing the defect severity in different artifacts of the SDLC of a software project.

17.

Applying the Mahalanobis-Taguchi strategy for software defect diagnosis

Dimitris Liparas Lefteris Angelis Robert Feldt 《Automated Software Engineering》2012,19(2):141-165

The Mahalanobis-Taguchi (MT) strategy combines mathematical and statistical concepts like Mahalanobis distance, Gram-Schmidt orthogonalization and experimental designs to support diagnosis and decision-making based on multivariate data. The primary purpose is to develop a scale to measure the degree of abnormality of cases, compared to “normal” or “healthy” cases, i.e. a continuous scale from a set of binary classified cases. An optimal subset of variables for measuring abnormality is then selected, and rules for future diagnosis are defined based on them and the measurement scale. This maps well to problems in software defect prediction based on a multivariate set of software metrics and attributes. In this paper, the MT strategy, combined with a cluster analysis technique for determining the most appropriate training set, is described and applied to well-known datasets in order to evaluate the fault-proneness of software modules. The measurement scale resulting from the MT strategy is evaluated using ROC curves, and the results show that the strategy is a promising technique for software defect diagnosis. It compares favorably to previously evaluated methods on a number of publicly available data sets. The special characteristic of the MT strategy, that it quantifies the level of abnormality, can also stimulate and inform discussions with engineers and managers in different defect prediction situations.

18.

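Surface defect detection method based on auto-encoding and knowledge distillation

Liu Taiheng (刘太亨) He Zhaoshui (何昭水) 《计算机应用》 (Journal of Computer Applications) 2021,41(11):3200-3205

Traditional surface defect detection methods can only detect obvious defect contours with high contrast or low noise. To address this problem, a surface defect detection method based on auto-encoding and knowledge distillation is proposed to accurately locate and classify the defects that appear in input images captured in real industrial environments. First, a cascaded autoencoder (CAE) architecture is designed to segment and locate defects, with the aim of converting the raw input image into a CAE-based prediction mask. Second, a threshold module binarizes the prediction to obtain accurate defect contours. Then, the defect regions extracted and cropped by the defect region detector are taken as the input of the next module. Finally, the defect regions from the CAE segmentation are classified by category using knowledge distillation. Experimental results show that, compared with several other surface defect detection methods, the proposed method has the best overall performance, with an average defect detection accuracy of 97.00%. The method can effectively segment small defects with unclear edges and meets the engineering requirements for real-time segmentation and detection of surface defects.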

19.

A family of experiments to investigate the effects of groupware for software inspection

Stefan Biffl Paul Grünbacher Michael Halling 《Automated Software Engineering》2006,13(3):373-394

It is widely accepted that the inspection of software artifacts can find defects early in the development process and gather information on the quality of the evolving product. However, the inspection process is resource-intensive and involves tedious tasks, such as searching, sorting, and checking. Tool support for inspections can help accelerate these tasks and allows inspectors to concentrate on tasks particularly needing human attention. Only a few tools are available for inspections. We have thus developed a set of groupware tools for both individual defect detection and inspection meetings to lower the effort of inspections and to increase their efficiency. This paper presents the Groupware-supported Inspection Process (GrIP) and describes tools for inspecting software requirements. As only little empirical work exists that directly compares paper-based and tool-based software inspection, we conducted a family of experiments in an academic environment to empirically investigate the effect of tool support regarding defect detection and inspection meetings. The main results of our family of experiments regarding individual defect detection are promising: The effectiveness of inspectors and teams is comparable to paper-based inspection without tool support; the inspection effort and defect overlap decrease significantly with tool support, while the efficiency of inspection teams increases considerably. Regarding tool support for inspection meetings, the main findings of the experiments are that tool support considerably lowers the meeting effort, supports inspectors in identifying false positives, and reduces the number of true defects lost during a meeting. The number of unidentified false positives is still quite high.

20.

Software defect prediction using Bayesian networks

Ahmet Okutan Olcay Taner Yıldız 《Empirical Software Engineering》2014,19(1):154-181

Many different software metrics have been proposed and used for defect prediction in the literature. Instead of dealing with so many metrics, it would be practical and easy if we could determine the set of metrics that are most important and focus on them more to predict defectiveness. We use Bayesian networks to determine the probabilistic influential relationships among software metrics and defect proneness. In addition to the metrics used in the Promise data repository, we define two more metrics, i.e. NOD for the number of developers and LOCQ for the source code quality. We extract these metrics by inspecting the source code repositories of the selected Promise data repository data sets. At the end of our modeling, we learn the marginal defect proneness probability of the whole software system, the set of most effective metrics, and the influential relationships among metrics and defectiveness. Our experiments on nine open source Promise data repository data sets show that response for class (RFC), lines of code (LOC), and lack of coding quality (LOCQ) are the most effective metrics whereas coupling between objects (CBO), weighted method per class (WMC), and lack of cohesion of methods (LCOM) are less effective metrics on defect proneness. Furthermore, number of children (NOC) and depth of inheritance tree (DIT) have very limited effect and are untrustworthy. On the other hand, based on the experiments on Poi, Tomcat, and Xalan data sets, we observe that there is a positive correlation between the number of developers (NOD) and the level of defectiveness. However, further investigation involving a greater number of projects is needed to confirm our findings.