首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
The identification of a module's fault-proneness is very important for minimizing cost and improving the effectiveness of the software development process. How to obtain the correlation between software metrics and module's fault-proneness has been the focus of much research. This paper presents the application of hybrid artificial neural network (ANN) and Quantum Particle Swarm Optimization (QPSO) in software fault-proneness prediction. ANN is used for classifying software modules into fault-proneness or non fault-proneness categories, and QPSO is applied for reducing dimensionality. The experiment results show that the proposed prediction approach can establish the correlation between software metrics and modules’ fault-proneness, and is very simple because its implementation requires neither extra cost nor expert's knowledge. Proposed prediction approach can provide the potential software modules with fault-proneness to software developers, so developers only need to focus on these software modules, which may minimize effort and cost of software maintenance.  相似文献   

2.
Many studies use logistic regression models to investigate the ability of complexity metrics to predict fault-prone classes. However, it is not uncommon to see the inappropriate use of performance indictors such as odds ratio in previous studies. In particular, a recent study by Olague et al. uses the odds ratio associated with one unit increase in a metric to compare the relative magnitude of the associations between individual metrics and fault-proneness. In addition, the percents of concordant, discordant, and tied pairs are used to evaluate the predictive effectiveness of a univariate logistic regression model. Their results suggest that lesser known complexity metrics such as standard deviation method complexity (SDMC) and average method complexity (AMC) are better predictors than the two commonly used metrics: lines of code (LOC) and weighted method McCabe complexity (WMC). In this paper, however, we show that (1) the odds ratio associated with one standard deviation increase, rather than one unit increase, in a metric should be used to compare the relative magnitudes of the effects of individual metrics on fault-proneness. Otherwise, misleading results may be obtained; and that (2) the connection of the percents of concordant, discordant, and tied pairs with the predictive effectiveness of a univariate logistic regression model is false, as they indeed do not depend on the model. Furthermore, we use the data collected from three versions of Eclipse to re-examine the ability of complexity metrics to predict fault-proneness. Our experimental results reveal that: (1) many metrics exhibit moderate or almost moderate ability in discriminating between fault-prone and not fault-prone classes; (2) LOC and WMC are indeed better fault-proneness predictors than SDMC and AMC; and (3) the explanatory power of other complexity metrics in addition to LOC is limited.  相似文献   

3.
We addresses the important problem of software quality analysis when there is limited software fault or fault-proneness data. A software quality model is typically trained using software measurement and fault data obtained from a previous release or similar project. Such an approach assumes that fault data is available for all the training modules. Various issues in software development may limit the availability of fault-proneness data for all the training modules. Consequently, the available labeled training dataset is such that the trained software quality model may not provide predictions. More specifically, the small set of modules with known fault-proneness labels is not sufficient for capturing the software quality trends of the project. We investigate semi-supervised learning with the Expectation Maximization (EM) algorithm for software quality estimation with limited fault-proneness data. The hypothesis is that knowledge stored in software attributes of the unlabeled program modules will aid in improving software quality estimation. Software data collected from a large NASA software project is used during the semi-supervised learning process. The software quality model is evaluated with multiple test datasets collected from other NASA software projects. Compared to software quality models trained only with the available set of labeled program modules, the EM-based semi-supervised learning scheme improves generalization performance of the software quality models.  相似文献   

4.
ContextRecently, network measures have been proposed to predict fault-prone modules. Leveraging the dependency relationships between software entities, network measures describe the structural features of software systems. However, there is no consensus about their effectiveness for fault-proneness prediction. Specifically, the predictive ability of network measures in effort-aware context has not been addressed.ObjectiveWe aim to provide a comprehensive evaluation on the predictive effectiveness of network measures with the effort needed to inspect the code taken into consideration.MethodWe first constructed software source code networks of 11 open-source projects by extracting the data and call dependencies between modules. We then employed univariate logistic regression to investigate how each single network measure was correlated with fault-proneness. Finally, we built multivariate prediction models to examine the usefulness of network measures under three prediction settings: cross-validation, across-release, and inter-project predictions. In particular, we used the effort-aware performance indicators to compare their predictive ability against the commonly used code metrics in both ranking and classification scenarios.ResultsBased on the 11 open-source software systems, our results show that: (1) most network measures are significantly positively related to fault-proneness; (2) the performance of network measures varies under different prediction settings; (3) network measures have inconsistent effects on various projects.ConclusionNetwork measures are of practical value in the context of effort-aware fault-proneness prediction, but researchers and practitioners should be careful of choosing whether and when to use network measures in practice.  相似文献   

5.
Object-oriented metrics aim to exhibit the quality of source code and give insight to it quantitatively. Each metric assesses the code from a different aspect. There is a relationship between the quality level and the risk level of source code. The objective of this paper is to empirically examine whether or not there are effective threshold values for source code metrics. It is targeted to derive generalized thresholds that can be used in different software systems. The relationship between metric thresholds and fault-proneness was investigated empirically in this study by using ten open-source software systems. Three types of fault-proneness were defined for the software modules: non-fault-prone, more-than-one-fault-prone, and more-than-three-fault-prone. Two independent case studies were carried out to derive two different threshold values. A single set was created by merging ten datasets and was used as training data by the model. The learner model was created using logistic regression and the Bender method. Results revealed that some metrics have threshold effects. Seven metrics gave satisfactory results in the first case study. In the second case study, eleven metrics gave satisfactory results. This study makes contributions primarily for use by software developers and testers. Software developers can see classes or modules that require revising; this, consequently, contributes to an increment in quality for these modules and a decrement in their risk level. Testers can identify modules that need more testing effort and can prioritize modules according to their risk levels.  相似文献   

6.
ContextThe software defect prediction during software development has recently attracted the attention of many researchers. The software defect density indicator prediction in each phase of software development life cycle (SDLC) is desirable for developing a reliable software product. Software defect prediction at the end of testing phase may not be more beneficial because the changes need to be performed in the previous phases of SDLC may require huge amount of money and effort to be spent in order to achieve target software quality. Therefore, phase-wise software defect density indicator prediction model is of great importance.ObjectiveIn this paper, a fuzzy logic based phase-wise software defect prediction model is proposed using the top most reliability relevant metrics of the each phase of the SDLC.MethodIn the proposed model, defect density indicator in requirement analysis, design, coding and testing phase is predicted using nine software metrics of these four phases. The defect density indicator metric predicted at the end of the each phase is also taken as an input to the next phase. Software metrics are assessed in linguistic terms and fuzzy inference system has been employed to develop the model.ResultsThe predictive accuracy of the proposed model is validated using twenty real software project data. Validation results are satisfactory. Measures based on the mean magnitude of relative error and balanced mean magnitude of relative error decrease significantly as the software project size increases.ConclusionIn this paper, a fuzzy logic based model is proposed for predicting software defect density indicator at each phase of the SDLC. The predicted defects of twenty different software projects are found very near to the actual defects detected during testing. The predicted defect density indicators are very helpful to analyze the defect severity in different artifacts of SDLC of a software project.  相似文献   

7.
Network measures are useful for predicting fault-prone modules. However, existing work has not distinguished faults according to their severity. In practice, high severity faults cause serious problems and require further attention. In this study, we explored the utility of network measures in high severity faultproneness prediction. We constructed software source code networks for four open-source projects by extracting the dependencies between modules. We then used univariate logistic regression to investigate the associations between each network measure and fault-proneness at a high severity level. We built multivariate prediction models to examine their explanatory ability for fault-proneness, as well as evaluated their predictive effectiveness compared to code metrics under forward-release and cross-project predictions. The results revealed the following: (1) most network measures are significantly related to high severity fault-proneness; (2) network measures generally have comparable explanatory abilities and predictive powers to those of code metrics; and (3) network measures are very unstable for cross-project predictions. These results indicate that network measures are of practical value in high severity fault-proneness prediction.  相似文献   

8.
ContextIn a large object-oriented software system, packages play the role of modules which group related classes together to provide well-identified services to the rest of the system. In this context, it is widely believed that modularization has a large influence on the quality of packages. Recently, Sarkar, Kak, and Rama proposed a set of new metrics to characterize the modularization quality of packages from important perspectives such as inter-module call traffic, state access violations, fragile base-class design, programming to interface, and plugin pollution. These package-modularization metrics are quite different from traditional package-level metrics, which measure software quality mainly from size, extensibility, responsibility, independence, abstractness, and instability perspectives. As such, it is expected that these package-modularization metrics should be useful predictors for fault-proneness. However, little is currently known on their actual usefulness for fault-proneness prediction, especially compared with traditional package-level metrics.ObjectiveIn this paper, we examine the role of these new package-modularization metrics for determining software fault-proneness in object-oriented systems.MethodWe first use principal component analysis to analyze whether these new package-modularization metrics capture additional information compared with traditional package-level metrics. Second, we employ univariate prediction models to investigate how these new package-modularization metrics are related to fault-proneness. Finally, we build multivariate prediction models to examine the ability of these new package-modularization metrics for predicting fault-prone packages.ResultsOur results, based on six open-source object-oriented software systems, show that: (1) these new package-modularization metrics provide new and complementary views of software complexity compared with traditional package-level metrics; (2) most of these new package-modularization metrics have a significant association with fault-proneness in an expected direction; and (3) these new package-modularization metrics can substantially improve the effectiveness of fault-proneness prediction when used with traditional package-level metrics together.ConclusionsThe package-modularization metrics proposed by Sarkar, Kak, and Rama are useful for practitioners to develop quality software systems.  相似文献   

9.
This paper describes a study performed in an industrial setting that attempts to build predictive models to identify parts of a Java system with a high fault probability. The system under consideration is constantly evolving as several releases a year are shipped to customers. Developers usually have limited resources for their testing and would like to devote extra resources to faulty system parts. The main research focus of this paper is to systematically assess three aspects on how to build and evaluate fault-proneness models in the context of this large Java legacy system development project: (1) compare many data mining and machine learning techniques to build fault-proneness models, (2) assess the impact of using different metric sets such as source code structural measures and change/fault history (process measures), and (3) compare several alternative ways of assessing the performance of the models, in terms of (i) confusion matrix criteria such as accuracy and precision/recall, (ii) ranking ability, using the receiver operating characteristic area (ROC), and (iii) our proposed cost-effectiveness measure (CE).The results of the study indicate that the choice of fault-proneness modeling technique has limited impact on the resulting classification accuracy or cost-effectiveness. There is however large differences between the individual metric sets in terms of cost-effectiveness, and although the process measures are among the most expensive ones to collect, including them as candidate measures significantly improves the prediction models compared with models that only include structural measures and/or their deltas between releases - both in terms of ROC area and in terms of CE. Further, we observe that what is considered the best model is highly dependent on the criteria that are used to evaluate and compare the models. And the regular confusion matrix criteria, although popular, are not clearly related to the problem at hand, namely the cost-effectiveness of using fault-proneness prediction models to focus verification efforts to deliver software with less faults at less cost.  相似文献   

10.
In this study, defect tracking is used as a proxy method to predict software readiness. The number of remaining defects in an application under development is one of the most important factors that allow one to decide if a piece of software is ready to be released. By comparing predicted number of faults and number of faults discovered in testing, software manager can decide whether the software is likely ready to be released or not.The predictive model developed in this research can predict: (i) the number of faults (defects) likely to exist, (ii) the estimated number of code changes required to correct a fault and (iii) the estimated amount of time (in minutes) needed to make the changes in respective classes of the application. The model uses product metrics as independent variables to do predictions. These metrics are selected depending on the nature of source code with regards to architecture layers, types of faults and contribution factors of these metrics. The use of neural network model with genetic training strategy is introduced to improve prediction results for estimating software readiness in this study. This genetic-net combines a genetic algorithm with a statistical estimator to produce a model which also shows the usefulness of inputs.The model is divided into three parts: (1) prediction model for presentation logic tier (2) prediction model for business tier and (3) prediction model for data access tier. Existing object-oriented metrics and complexity software metrics are used in the business tier prediction model. New sets of metrics have been proposed for the presentation logic tier and data access tier. These metrics are validated using data extracted from real world applications. The trained models can be used as tools to assist software mangers in making software release decisions.  相似文献   

11.
Testability measures have been defined on flowgraphs modelling the control flow through a program. These measures attempt to quantify aspects of the structural complexity of code that might give useful information about the testing stage of software production. This paper shows how two such metrics, the Number of Trails metric and the Mask [k = 2] metric, can be calculated axiomatically.  相似文献   

12.
过程中的跟踪和度量在软件开发中扮演着重要的角色,尤其是对软件测试。本文讨论了若干测试过程中度量方法,对每种度量法进行了介绍,并结合实测图示数据加以说明。最后对在质量管理中怎样使用过程度量方法给出了一些建议。  相似文献   

13.
过程中的跟踪和度量在软件开发中扮演着重要的角色,尤其是对软件测试。本文讨论了若干测试过程中度量方法,对每种度量法进行了介绍,并结合实测图示数据加以说明。最后对在质量管理中怎样使用过程度量方法给出了一些建议。  相似文献   

14.
In the last decade, empirical studies on object-oriented design metrics have shown some of them to be useful for predicting the fault-proneness of classes in object-oriented software systems. This research did not, however, distinguish among faults according to the severity of impact. It would be valuable to know how object-oriented design metrics and class fault-proneness are related when fault severity is taken into account. In this paper, we use logistic regression and machine learning methods to empirically investigate the usefulness of object-oriented design metrics, specifically, a subset of the Chidamber and Kemerer suite, in predicting fault-proneness when taking fault severity into account. Our results, based on a public domain NASA data set, indicate that 1) most of these design metrics are statistically related to fault-proneness of classes across fault severity, and 2) the prediction capabilities of the investigated metrics greatly depend on the severity of faults. More specifically, these design metrics are able to predict low severity faults in fault-prone classes better than high severity faults in fault-prone classes  相似文献   

15.
Ontology languages such as OWL are being widely used as the Semantic Web movement gains momentum. With the proliferation of the Semantic Web, more and more large-scale ontologies are being developed in real-world applications to represent and integrate knowledge and data. There is an increasing need for measuring the complexity of these ontologies in order for people to better understand, maintain, reuse and integrate them. In this paper, inspired by the concept of software metrics, we propose a suite of ontology metrics, at both the ontology-level and class-level, to measure the design complexity of ontologies. The proposed metrics are analytically evaluated against Weyuker’s criteria. We have also performed empirical analysis on public domain ontologies to show the characteristics and usefulness of the metrics. We point out possible applications of the proposed metrics to ontology quality control. We believe that the proposed metric suite is useful for managing ontology development projects.  相似文献   

16.
This paper presents a case history of Mentor Graphics using a set of quality metrics to track development progress for a recent major software release. It provides background on how Mentor Graphics originally began using software metrics to measure product quality, how this became accepted, and how these metrics later fell out of favour. To restore these metrics to effective use, process changes were required for setting quality and metric targets, and for the way the metrics are used for tracking development progress. With these process changes in place, and the addition of a new metric, the case history demonstrates that the metric set could be used effectively to indicate problems in this release and help manage changes to the plan for completion of the release. The lessons learned in this case history are presented, along with subsequent data that further validates these metrics.  相似文献   

17.
ContextSoftware defect prediction plays a crucial role in estimating the most defect-prone components of software, and a large number of studies have pursued improving prediction accuracy within a project or across projects. However, the rules for making an appropriate decision between within- and cross-project defect prediction when available historical data are insufficient remain unclear.ObjectiveThe objective of this work is to validate the feasibility of the predictor built with a simplified metric set for software defect prediction in different scenarios, and to investigate practical guidelines for the choice of training data, classifier and metric subset of a given project.MethodFirst, based on six typical classifiers, three types of predictors using the size of software metric set were constructed in three scenarios. Then, we validated the acceptable performance of the predictor based on Top-k metrics in terms of statistical methods. Finally, we attempted to minimize the Top-k metric subset by removing redundant metrics, and we tested the stability of such a minimum metric subset with one-way ANOVA tests.ResultsThe study has been conducted on 34 releases of 10 open-source projects available at the PROMISE repository. The findings indicate that the predictors built with either Top-k metrics or the minimum metric subset can provide an acceptable result compared with benchmark predictors. The guideline for choosing a suitable simplified metric set in different scenarios is presented in Table 12.ConclusionThe experimental results indicate that (1) the choice of training data for defect prediction should depend on the specific requirement of accuracy; (2) the predictor built with a simplified metric set works well and is very useful in case limited resources are supplied; (3) simple classifiers (e.g., Naïve Bayes) also tend to perform well when using a simplified metric set for defect prediction; and (4) in several cases, the minimum metric subset can be identified to facilitate the procedure of general defect prediction with acceptable loss of prediction precision in practice.  相似文献   

18.
Previous studies have demonstrated the relationship between coupling and external software quality attributes, such as fault-proneness, and the application of coupling to software maintenance tasks, such as impact analysis. These previous studies concentrate on class coupling. However, there is a growing focus on the study of features in software, and features are often implemented across multiple classes, meaning class-level coupling measures are not applicable. We ask the pertinent question, “Is measuring coupling at the feature-level also useful?” We define new feature coupling metrics based on structural and textual source code information and extend the unified framework for coupling measurement to include these new metrics. We also conduct three extensive case studies to evaluate these new metrics and answer this research question. The first study examines the relationship between feature coupling and fault-proneness, the second assesses feature coupling in the context of impact analysis, and the third study surveys developers to determine if the metrics align with what they consider to be coupled features. All three studies provide evidence that feature coupling metrics are indeed useful new measures that capture coupling at a higher level of abstraction than classes and can be useful for finding bugs, guiding testing effort, and assessing change impact.  相似文献   

19.
This paper proposes an artificial neural network (ANN) based software reliability model trained by novel particle swarm optimization (PSO) algorithm for enhanced forecasting of the reliability of software. The proposed ANN is developed considering the fault generation phenomenon during software testing with the fault complexity of different levels. We demonstrate the proposed model considering three types of faults residing in the software. We propose a neighborhood based fuzzy PSO algorithm for competent learning of the proposed ANN using software failure data. Fitting and prediction performances of the neighborhood fuzzy PSO based proposed neural network model are compared with the standard PSO based proposed neural network model and existing ANN based software reliability models in the literature through three real software failure data sets. We also compare the performance of the proposed PSO algorithm with the standard PSO algorithm through learning of the proposed ANN. Statistical analysis shows that the neighborhood fuzzy PSO based proposed neural network model has comparatively better fitting and predictive ability than the standard PSO based proposed neural network model and other ANN based software reliability models. Faster release of software is achievable by applying the proposed PSO based neural network model during the testing period.   相似文献   

20.
We present a coverage metric targeted at shared-memory concurrent programs: the Location Pairs (LP) coverage metric. The goals of this metric are (i) to measure how thoroughly a program has been tested from a concurrency standpoint, i.e., whether enough qualitatively different thread interleavings have been explored, and (ii) to guide testing towards unexplored concurrency scenarios. This metric was inspired by an access pattern known to lead to high-level concurrency errors in industrial software and in the literature. We built a monitoring tool to measure LP coverage of test programs. We used the LP metric for interactive debugging, and compared LP coverage with other concurrency coverage metrics on Java benchmarks. We demonstrated that LP coverage corresponds better to concurrency errors, is a better measure of how well a program is exercised concurrency-wise by a test set, reaches saturation later than other coverage metrics, and is viable and useful as an interactive testing and debugging tool.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号