首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Network measures are useful for predicting fault-prone modules. However, existing work has not distinguished faults according to their severity. In practice, high severity faults cause serious problems and require further attention. In this study, we explored the utility of network measures in high severity faultproneness prediction. We constructed software source code networks for four open-source projects by extracting the dependencies between modules. We then used univariate logistic regression to investigate the associations between each network measure and fault-proneness at a high severity level. We built multivariate prediction models to examine their explanatory ability for fault-proneness, as well as evaluated their predictive effectiveness compared to code metrics under forward-release and cross-project predictions. The results revealed the following: (1) most network measures are significantly related to high severity fault-proneness; (2) network measures generally have comparable explanatory abilities and predictive powers to those of code metrics; and (3) network measures are very unstable for cross-project predictions. These results indicate that network measures are of practical value in high severity fault-proneness prediction.  相似文献   

2.
ContextIn a large object-oriented software system, packages play the role of modules which group related classes together to provide well-identified services to the rest of the system. In this context, it is widely believed that modularization has a large influence on the quality of packages. Recently, Sarkar, Kak, and Rama proposed a set of new metrics to characterize the modularization quality of packages from important perspectives such as inter-module call traffic, state access violations, fragile base-class design, programming to interface, and plugin pollution. These package-modularization metrics are quite different from traditional package-level metrics, which measure software quality mainly from size, extensibility, responsibility, independence, abstractness, and instability perspectives. As such, it is expected that these package-modularization metrics should be useful predictors for fault-proneness. However, little is currently known on their actual usefulness for fault-proneness prediction, especially compared with traditional package-level metrics.ObjectiveIn this paper, we examine the role of these new package-modularization metrics for determining software fault-proneness in object-oriented systems.MethodWe first use principal component analysis to analyze whether these new package-modularization metrics capture additional information compared with traditional package-level metrics. Second, we employ univariate prediction models to investigate how these new package-modularization metrics are related to fault-proneness. Finally, we build multivariate prediction models to examine the ability of these new package-modularization metrics for predicting fault-prone packages.ResultsOur results, based on six open-source object-oriented software systems, show that: (1) these new package-modularization metrics provide new and complementary views of software complexity compared with traditional package-level metrics; (2) most of these new package-modularization metrics have a significant association with fault-proneness in an expected direction; and (3) these new package-modularization metrics can substantially improve the effectiveness of fault-proneness prediction when used with traditional package-level metrics together.ConclusionsThe package-modularization metrics proposed by Sarkar, Kak, and Rama are useful for practitioners to develop quality software systems.  相似文献   

3.
ContextEffort-aware models, e.g., effort-aware bug prediction models aim to help practitioners identify and prioritize buggy software locations according to the effort involved with fixing the bugs. Since the effort of current bugs is not yet known and the effort of past bugs is typically not explicitly recorded, effort-aware bug prediction models are forced to use approximations, such as the number of lines of code (LOC) of the predicted files.ObjectiveAlthough the choice of these approximations is critical for the performance of the prediction models, there is no empirical evidence on whether LOC is actually a good approximation. Therefore, in this paper, we investigate the question: is LOC a good measure of effort for use in effort-aware models?MethodWe perform an empirical study on four open source projects, for which we obtain explicitly-recorded effort data, and compare the use of LOC to various complexity, size and churn metrics as measures of effort.ResultsWe find that using a combination of complexity, size and churn metrics are a better measure of effort than using LOC alone. Furthermore, we examine the impact of our findings on previous effort-aware bug prediction work and find that using LOC as a measure for effort does not significantly affect the list of files being flagged, however, using LOC under-estimates the amount of effort required compared to our best effort predictor by approximately 66%.ConclusionStudies using effort-aware models should not assume that LOC is a good measure of effort. For the case of effort-aware bug prediction, using LOC provides results that are similar to combining complexity, churn, size and LOC as a proxy for effort when prioritizing the most risky files. However, we find that for the purpose of effort-estimation, using LOC may under-estimate the amount of effort required.  相似文献   

4.
The identification of a module's fault-proneness is very important for minimizing cost and improving the effectiveness of the software development process. How to obtain the correlation between software metrics and module's fault-proneness has been the focus of much research. This paper presents the application of hybrid artificial neural network (ANN) and Quantum Particle Swarm Optimization (QPSO) in software fault-proneness prediction. ANN is used for classifying software modules into fault-proneness or non fault-proneness categories, and QPSO is applied for reducing dimensionality. The experiment results show that the proposed prediction approach can establish the correlation between software metrics and modules’ fault-proneness, and is very simple because its implementation requires neither extra cost nor expert's knowledge. Proposed prediction approach can provide the potential software modules with fault-proneness to software developers, so developers only need to focus on these software modules, which may minimize effort and cost of software maintenance.  相似文献   

5.
ContextCode ownership metrics were recently defined in order to distinguish major and minor contributors of a software module, and to assess whether the ownership of such a module is strong or shared between developers.ObjectiveThe relationship between these metrics and software quality was initially validated on proprietary software projects. Our objective in this paper is to evaluate such relationship in open-source software projects, and to compare these metrics to other code and process metrics.MethodOn a newly crafted dataset of seven open-source software projects, we perform, using inferential statistics, an analysis of code ownership metrics and their relationship with software quality.ResultsWe confirm the existence of a relationship between code ownership and software quality, but the relative importance of ownership metrics in multiple linear regression models is low compared to metrics such as the number of lines of code, the number of modifications performed over the last release, or the number of developers of a module.ConclusionAlthough we do find a relationship between code ownership and software quality, the added value of ownership metrics compared to other metrics is still to be proven.  相似文献   

6.
Object-oriented metrics aim to exhibit the quality of source code and give insight to it quantitatively. Each metric assesses the code from a different aspect. There is a relationship between the quality level and the risk level of source code. The objective of this paper is to empirically examine whether or not there are effective threshold values for source code metrics. It is targeted to derive generalized thresholds that can be used in different software systems. The relationship between metric thresholds and fault-proneness was investigated empirically in this study by using ten open-source software systems. Three types of fault-proneness were defined for the software modules: non-fault-prone, more-than-one-fault-prone, and more-than-three-fault-prone. Two independent case studies were carried out to derive two different threshold values. A single set was created by merging ten datasets and was used as training data by the model. The learner model was created using logistic regression and the Bender method. Results revealed that some metrics have threshold effects. Seven metrics gave satisfactory results in the first case study. In the second case study, eleven metrics gave satisfactory results. This study makes contributions primarily for use by software developers and testers. Software developers can see classes or modules that require revising; this, consequently, contributes to an increment in quality for these modules and a decrement in their risk level. Testers can identify modules that need more testing effort and can prioritize modules according to their risk levels.  相似文献   

7.
Applying machine learning to software fault-proneness prediction   总被引:1,自引:0,他引:1  
The importance of software testing to quality assurance cannot be overemphasized. The estimation of a module’s fault-proneness is important for minimizing cost and improving the effectiveness of the software testing process. Unfortunately, no general technique for estimating software fault-proneness is available. The observed correlation between some software metrics and fault-proneness has resulted in a variety of predictive models based on multiple metrics. Much work has concentrated on how to select the software metrics that are most likely to indicate fault-proneness. In this paper, we propose the use of machine learning for this purpose. Specifically, given historical data on software metric values and number of reported errors, an Artificial Neural Network (ANN) is trained. Then, in order to determine the importance of each software metric in predicting fault-proneness, a sensitivity analysis is performed on the trained ANN. The software metrics that are deemed to be the most critical are then used as the basis of an ANN-based predictive model of a continuous measure of fault-proneness. We also view fault-proneness prediction as a binary classification task (i.e., a module can either contain errors or be error-free) and use Support Vector Machines (SVM) as a state-of-the-art classification method. We perform a comparative experimental study of the effectiveness of ANNs and SVMs on a data set obtained from NASA’s Metrics Data Program data repository.  相似文献   

8.
Open source software systems are becoming increasingly important these days. Many companies are investing in open source projects and lots of them are also using such software in their own work. But, because open source software is often developed with a different management style than the industrial ones, the quality and reliability of the code needs to be studied. Hence, the characteristics of the source code of these projects need to be measured to obtain more information about it. This paper describes how we calculated the object-oriented metrics given by Chidamber and Kemerer to illustrate how fault-proneness detection of the source code of the open source Web and e-mail suite called Mozilla can be carried out. We checked the values obtained against the number of bugs found in its bug database - called Bugzilla - using regression and machine learning methods to validate the usefulness of these metrics for fault-proneness prediction. We also compared the metrics of several versions of Mozilla to see how the predicted fault-proneness of the software system changed during its development cycle.  相似文献   

9.
We addresses the important problem of software quality analysis when there is limited software fault or fault-proneness data. A software quality model is typically trained using software measurement and fault data obtained from a previous release or similar project. Such an approach assumes that fault data is available for all the training modules. Various issues in software development may limit the availability of fault-proneness data for all the training modules. Consequently, the available labeled training dataset is such that the trained software quality model may not provide predictions. More specifically, the small set of modules with known fault-proneness labels is not sufficient for capturing the software quality trends of the project. We investigate semi-supervised learning with the Expectation Maximization (EM) algorithm for software quality estimation with limited fault-proneness data. The hypothesis is that knowledge stored in software attributes of the unlabeled program modules will aid in improving software quality estimation. Software data collected from a large NASA software project is used during the semi-supervised learning process. The software quality model is evaluated with multiple test datasets collected from other NASA software projects. Compared to software quality models trained only with the available set of labeled program modules, the EM-based semi-supervised learning scheme improves generalization performance of the software quality models.  相似文献   

10.
This paper describes a study performed in an industrial setting that attempts to build predictive models to identify parts of a Java system with a high fault probability. The system under consideration is constantly evolving as several releases a year are shipped to customers. Developers usually have limited resources for their testing and would like to devote extra resources to faulty system parts. The main research focus of this paper is to systematically assess three aspects on how to build and evaluate fault-proneness models in the context of this large Java legacy system development project: (1) compare many data mining and machine learning techniques to build fault-proneness models, (2) assess the impact of using different metric sets such as source code structural measures and change/fault history (process measures), and (3) compare several alternative ways of assessing the performance of the models, in terms of (i) confusion matrix criteria such as accuracy and precision/recall, (ii) ranking ability, using the receiver operating characteristic area (ROC), and (iii) our proposed cost-effectiveness measure (CE).The results of the study indicate that the choice of fault-proneness modeling technique has limited impact on the resulting classification accuracy or cost-effectiveness. There is however large differences between the individual metric sets in terms of cost-effectiveness, and although the process measures are among the most expensive ones to collect, including them as candidate measures significantly improves the prediction models compared with models that only include structural measures and/or their deltas between releases - both in terms of ROC area and in terms of CE. Further, we observe that what is considered the best model is highly dependent on the criteria that are used to evaluate and compare the models. And the regular confusion matrix criteria, although popular, are not clearly related to the problem at hand, namely the cost-effectiveness of using fault-proneness prediction models to focus verification efforts to deliver software with less faults at less cost.  相似文献   

11.
ContextSoftware defect prediction (SDP) is an important task in software engineering. Along with estimating the number of defects remaining in software systems and discovering defect associations, classifying the defect-proneness of software modules plays an important role in software defect prediction. Several machine-learning methods have been applied to handle the defect-proneness of software modules as a classification problem. This type of “yes” or “no” decision is an important drawback in the decision-making process and if not precise may lead to misclassifications. To the best of our knowledge, existing approaches rely on fully automated module classification and do not provide a way to incorporate extra knowledge during the classification process. This knowledge can be helpful in avoiding misclassifications in cases where system modules cannot be classified in a reliable way.ObjectiveWe seek to develop a SDP method that (i) incorporates a reject option in the classifier to improve the reliability in the decision-making process; and (ii) makes it possible postpone the final decision related to rejected modules for an expert analysis or even for another classifier using extra domain knowledge.MethodWe develop a SDP method called rejoELM and its variant, IrejoELM. Both methods were built upon the weighted extreme learning machine (ELM) with reject option that makes it possible postpone the final decision of non-classified modules, the rejected ones, to another moment. While rejoELM aims to maximize the accuracy for a rejection rate, IrejoELM maximizes the F-measure. Hence, IrejoELM becomes an alternative for classification with reject option for imbalanced datasets.ResultsrejoEM and IrejoELM are tested on five datasets of source code metrics extracted from real world open-source software projects. Results indicate that rejoELM has an accuracy for several rejection rates that is comparable to some state-of-the-art classifiers with reject option. Although IrejoELM shows lower accuracies for several rejection rates, it clearly outperforms all other methods when the F-measure is used as a performance metric.ConclusionIt is concluded that rejoELM is a valid alternative for classification with reject option problems when classes are nearly equally represented. On the other hand, IrejoELM is shown to be the best alternative for classification with reject option on imbalanced datasets. Since SDP problems are usually characterized as imbalanced learning problems, the use of IrejoELM is recommended.  相似文献   

12.
BackgroundSource code size in terms of SLOC (source lines of code) is the input of many parametric software effort estimation models. However, it is unavailable at the early phase of software development.ObjectiveWe investigate the accuracy of early SLOC estimation approaches for an object-oriented system using the information collected from its UML class diagram available at the early software development phase.MethodWe use different modeling techniques to build the prediction models for investigating the accuracy of six types of metrics to estimate SLOC. The used techniques include linear models, non-linear models, rule/tree-based models, and instance-based models. The investigated metrics are class diagram metrics, predictive object points, object-oriented project size metric, fast&&serious class points, objective class points, and object-oriented function points.ResultsBased on 100 open-source Java systems, we find that the prediction model built using object-oriented project size metric and ordinary least square regression with a logarithmic transformation achieves the highest accuracy (mean MMRE = 0.19 and mean Pred(25) = 0.74).ConclusionWe should use object-oriented project size metric and ordinary least square regression with a logarithmic transformation to build a simple, accurate, and comprehensible SLOC estimation model.  相似文献   

13.
ContextAlthough many papers have been published on software development and defect prediction techniques, problem reports in real projects quite often differ from those described in the literature. Hence, there is still a need for deeper exploration of case studies from industry.ObjectiveThe aim of this study is to present the impact of fine-grained problem reports on improving evaluation of testing and maintenance processes. It is targeted at projects involving several releases and complex schemes of problem handling. This is based on our experience gained while monitoring several commercial projects.MethodExtracting certain features from detailed problem reports, we derive various measures and present analysis models which characterize and visualize the effectiveness of testing and problem resolution processes. The considered reports describe types of problems (e.g. defects), their locations in project versions and software modules, ways of their resolution, etc. The performed analysis is related to eleven projects developed in the same company. This study is an exploratory research with some explanatory features. Moreover, having identified some drawbacks, we present extensions of problem reports and their analysis which have been verified in another industrial case study project.ResultsFine-grained (accurate) problem handling reports provide a wider scope of possible measures to assess the relevant development processes. This is helpful in controlling single projects (local perspective) as well as in managing these processes in the whole company (global perspective).ConclusionDetailed problem handling reports extend the space and quality of statistical analysis, they provide significant enhancement in evaluation and refinement of software development processes as well as in reliability prediction.  相似文献   

14.
Empirical validation of software metrics suites to predict fault proneness in object-oriented (OO) components is essential to ensure their practical use in industrial settings. In this paper, we empirically validate three OO metrics suites for their ability to predict software quality in terms of fault-proneness: the Chidamber and Kemerer (CK) metrics, Abreu's Metrics for Object-Oriented Design (MOOD), and Bansiya and Davis' Quality Metrics for Object-Oriented Design (QMOOD). Some CK class metrics have previously been shown to be good predictors of initial OO software quality. However, the other two suites have not been heavily validated except by their original proposers. Here, we explore the ability of these three metrics suites to predict fault-prone classes using defect data for six versions of Rhino, an open-source implementation of JavaScript written in Java. We conclude that the CK and QMOOD suites contain similar components and produce statistical models that are effective in detecting error-prone classes. We also conclude that the class components in the MOOD metrics suite are not good class fault-proneness predictors. Analyzing multivariate binary logistic regression models across six Rhino versions indicates these models may be useful in assessing quality in OO classes produced using modern highly iterative or agile software development processes.  相似文献   

15.
Context:How can quality of software systems be predicted before deployment? In attempting to answer this question, prediction models are advocated in several studies. The performance of such models drops dramatically, with very low accuracy, when they are used in new software development environments or in new circumstances.ObjectiveThe main objective of this work is to circumvent the model generalizability problem. We propose a new approach that substitutes traditional ways of building prediction models which use historical data and machine learning techniques.MethodIn this paper, existing models are decision trees built to predict module fault-proneness within the NASA Critical Mission Software. A genetic algorithm is developed to combine and adapt expertise extracted from existing models in order to derive a “composite” model that performs accurately in a given context of software development. Experimental evaluation of the approach is carried out in three different software development circumstances.ResultsThe results show that derived prediction models work more accurately not only for a particular state of a software organization but also for evolving and modified ones.ConclusionOur approach is considered suitable for software data nature and at the same time superior to model selection and data combination approaches. It is then concluded that learning from existing software models (i.e., software expertise) has two immediate advantages; circumventing model generalizability and alleviating the lack of data in software-engineering.  相似文献   

16.
随着规模和复杂性的迅猛膨胀,软件系统中不可避免地存在缺陷.近年来,基于深度学习的缺陷预测技术成为软件工程领域的研究热点.该类技术可以在不运行代码的情况下发现其中潜藏的缺陷,因而在工业界和学术界受到了广泛的关注.然而,已有方法大多关注方法级的源代码中是否存在缺陷,无法精确识别具体的缺陷类别,从而降低了开发人员进行缺陷定位及修复工作的效率.此外,在实际软件开发实践中,新的项目通常缺乏足够的缺陷数据来训练高精度的深度学习模型,而利用已有项目的历史数据训练好的模型往往在新项目上无法达到良好的泛化性能.因此,本文首先将传统的二分类缺陷预测任务表述为多标签分类问题,即使用CWE(common weakness enumeration)中描述的缺陷类别作为细粒度的模型预测标签.为了提高跨项目场景下的模型性能,本文提出一种融合对抗训练和注意力机制的多源域适应框架.具体而言,该框架通过对抗训练来减少域(即软件项目)差异,并进一步利用域不变特征来获得每个源域和目标域之间的特征相关性.同时,该框架还利用加权最大均值差异作为注意力机制以最小化源域和目标域特征之间的表示距离,从而使模型可以学习到更多的域无关特征.最后在八个真实世界的开源项目上与最先进的基线方法进行大量对比实验验证了所提方法的有效性.  相似文献   

17.
已有研究根据软件的代码依赖、修改历史、协同开发关系等,建立网络模型来预测软件的缺陷;近年来,网络嵌入技术广泛用于软件网络分析,显著提升了缺陷预测效果.本研究发现不同软件关联网络和网络嵌入算法的组合将影响缺陷预测性能.具体地,本文针对3种软件关联网络(类依赖网络、文件耦合网络和开发者贡献网络),并应用6类网络嵌入方法,分...  相似文献   

18.
软件缺陷预测技术用于定位软件中可能存在缺陷的代码模块,从而辅助开发人员进行测试与修复。传统的软件缺陷特征为基于软件规模、复杂度和语言特点等人工提取的静态度量元信息。然而,静态度量元特征无法直接捕捉程序上下文中的缺陷信息,从而影响了软件缺陷预测的性能。为了充分利用程序上下文中的语法语义信息,论文提出了一种基于混合注意力机制的软件缺陷预测方法 DP-MHA(Defect Prediction via Mixed Attention Mechanism)。DP-MHA首先从程序模块中提取基于AST树的语法语义序列并进行词嵌入编码和位置编码,然后基于多头注意力机制自学习上下文语法语义信息,最后利用全局注意力机制提取关键的语法语义特征,用于构建软件缺陷预测模型并识别存在潜在缺陷的代码模块。为了验证DP-MHA的有效性,论文选取了六个Apache的开源Java数据集,与经典的基于RF的静态度量元方法、基于RBM+RF、DBN+RF无监督学习方法和基于CNN和RNN深度学习方法进行对比,实验结果表明,DP-MHA在F1值分别提升了16.6%、34.3%、26.4%、7.1%、4.9%。  相似文献   

19.
System analysts often use software fault prediction models to identify fault-prone modules during the design phase of the software development life cycle. The models help predict faulty modules based on the software metrics that are input to the models. In this study, we consider 20 types of metrics to develop a model using an extreme learning machine associated with various kernel methods. We evaluate the effectiveness of the mode using a proposed framework based on the cost and efficiency in the testing phases. The evaluation process is carried out by considering case studies for 30 object-oriented software systems. Experimental results demonstrate that the application of a fault prediction model is suitable for projects with the percentage of faulty classes below a certain threshold, which depends on the efficiency of fault identification (low: 47.28%; median: 39.24%; high: 25.72%). We consider nine feature selection techniques to remove the irrelevant metrics and to select the best set of source code metrics for fault prediction.  相似文献   

20.
ContextSoftware metrics may be used in fault prediction models to improve software quality by predicting fault location.ObjectiveThis paper aims to identify software metrics and to assess their applicability in software fault prediction. We investigated the influence of context on metrics’ selection and performance.MethodThis systematic literature review includes 106 papers published between 1991 and 2011. The selected papers are classified according to metrics and context properties.ResultsObject-oriented metrics (49%) were used nearly twice as often compared to traditional source code metrics (27%) or process metrics (24%). Chidamber and Kemerer’s (CK) object-oriented metrics were most frequently used. According to the selected studies there are significant differences between the metrics used in fault prediction performance. Object-oriented and process metrics have been reported to be more successful in finding faults compared to traditional size and complexity metrics. Process metrics seem to be better at predicting post-release faults compared to any static code metrics.ConclusionMore studies should be performed on large industrial software systems to find metrics more relevant for the industry and to answer the question as to which metrics should be used in a given context.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号