首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
The identification of a module's fault-proneness is very important for minimizing cost and improving the effectiveness of the software development process. How to obtain the correlation between software metrics and module's fault-proneness has been the focus of much research. This paper presents the application of hybrid artificial neural network (ANN) and Quantum Particle Swarm Optimization (QPSO) in software fault-proneness prediction. ANN is used for classifying software modules into fault-proneness or non fault-proneness categories, and QPSO is applied for reducing dimensionality. The experiment results show that the proposed prediction approach can establish the correlation between software metrics and modules’ fault-proneness, and is very simple because its implementation requires neither extra cost nor expert's knowledge. Proposed prediction approach can provide the potential software modules with fault-proneness to software developers, so developers only need to focus on these software modules, which may minimize effort and cost of software maintenance.  相似文献   

2.
Previous studies have demonstrated the relationship between coupling and external software quality attributes, such as fault-proneness, and the application of coupling to software maintenance tasks, such as impact analysis. These previous studies concentrate on class coupling. However, there is a growing focus on the study of features in software, and features are often implemented across multiple classes, meaning class-level coupling measures are not applicable. We ask the pertinent question, “Is measuring coupling at the feature-level also useful?” We define new feature coupling metrics based on structural and textual source code information and extend the unified framework for coupling measurement to include these new metrics. We also conduct three extensive case studies to evaluate these new metrics and answer this research question. The first study examines the relationship between feature coupling and fault-proneness, the second assesses feature coupling in the context of impact analysis, and the third study surveys developers to determine if the metrics align with what they consider to be coupled features. All three studies provide evidence that feature coupling metrics are indeed useful new measures that capture coupling at a higher level of abstraction than classes and can be useful for finding bugs, guiding testing effort, and assessing change impact.  相似文献   

3.
Network measures are useful for predicting fault-prone modules. However, existing work has not distinguished faults according to their severity. In practice, high severity faults cause serious problems and require further attention. In this study, we explored the utility of network measures in high severity faultproneness prediction. We constructed software source code networks for four open-source projects by extracting the dependencies between modules. We then used univariate logistic regression to investigate the associations between each network measure and fault-proneness at a high severity level. We built multivariate prediction models to examine their explanatory ability for fault-proneness, as well as evaluated their predictive effectiveness compared to code metrics under forward-release and cross-project predictions. The results revealed the following: (1) most network measures are significantly related to high severity fault-proneness; (2) network measures generally have comparable explanatory abilities and predictive powers to those of code metrics; and (3) network measures are very unstable for cross-project predictions. These results indicate that network measures are of practical value in high severity fault-proneness prediction.  相似文献   

4.
Open source software systems are becoming increasingly important these days. Many companies are investing in open source projects and lots of them are also using such software in their own work. But, because open source software is often developed with a different management style than the industrial ones, the quality and reliability of the code needs to be studied. Hence, the characteristics of the source code of these projects need to be measured to obtain more information about it. This paper describes how we calculated the object-oriented metrics given by Chidamber and Kemerer to illustrate how fault-proneness detection of the source code of the open source Web and e-mail suite called Mozilla can be carried out. We checked the values obtained against the number of bugs found in its bug database - called Bugzilla - using regression and machine learning methods to validate the usefulness of these metrics for fault-proneness prediction. We also compared the metrics of several versions of Mozilla to see how the predicted fault-proneness of the software system changed during its development cycle.  相似文献   

5.
Applying machine learning to software fault-proneness prediction   总被引:1,自引:0,他引:1  
The importance of software testing to quality assurance cannot be overemphasized. The estimation of a module’s fault-proneness is important for minimizing cost and improving the effectiveness of the software testing process. Unfortunately, no general technique for estimating software fault-proneness is available. The observed correlation between some software metrics and fault-proneness has resulted in a variety of predictive models based on multiple metrics. Much work has concentrated on how to select the software metrics that are most likely to indicate fault-proneness. In this paper, we propose the use of machine learning for this purpose. Specifically, given historical data on software metric values and number of reported errors, an Artificial Neural Network (ANN) is trained. Then, in order to determine the importance of each software metric in predicting fault-proneness, a sensitivity analysis is performed on the trained ANN. The software metrics that are deemed to be the most critical are then used as the basis of an ANN-based predictive model of a continuous measure of fault-proneness. We also view fault-proneness prediction as a binary classification task (i.e., a module can either contain errors or be error-free) and use Support Vector Machines (SVM) as a state-of-the-art classification method. We perform a comparative experimental study of the effectiveness of ANNs and SVMs on a data set obtained from NASA’s Metrics Data Program data repository.  相似文献   

6.
ContextIn a large object-oriented software system, packages play the role of modules which group related classes together to provide well-identified services to the rest of the system. In this context, it is widely believed that modularization has a large influence on the quality of packages. Recently, Sarkar, Kak, and Rama proposed a set of new metrics to characterize the modularization quality of packages from important perspectives such as inter-module call traffic, state access violations, fragile base-class design, programming to interface, and plugin pollution. These package-modularization metrics are quite different from traditional package-level metrics, which measure software quality mainly from size, extensibility, responsibility, independence, abstractness, and instability perspectives. As such, it is expected that these package-modularization metrics should be useful predictors for fault-proneness. However, little is currently known on their actual usefulness for fault-proneness prediction, especially compared with traditional package-level metrics.ObjectiveIn this paper, we examine the role of these new package-modularization metrics for determining software fault-proneness in object-oriented systems.MethodWe first use principal component analysis to analyze whether these new package-modularization metrics capture additional information compared with traditional package-level metrics. Second, we employ univariate prediction models to investigate how these new package-modularization metrics are related to fault-proneness. Finally, we build multivariate prediction models to examine the ability of these new package-modularization metrics for predicting fault-prone packages.ResultsOur results, based on six open-source object-oriented software systems, show that: (1) these new package-modularization metrics provide new and complementary views of software complexity compared with traditional package-level metrics; (2) most of these new package-modularization metrics have a significant association with fault-proneness in an expected direction; and (3) these new package-modularization metrics can substantially improve the effectiveness of fault-proneness prediction when used with traditional package-level metrics together.ConclusionsThe package-modularization metrics proposed by Sarkar, Kak, and Rama are useful for practitioners to develop quality software systems.  相似文献   

7.
Applying metrics to software is a way to measure and improve software quality. Many metrics apply to software implementations (code), so they cannot be used early in the life cycle. We survey ten modularity and structural complexity metrics applicable to software designs, and summarize the results of empirical validation studies when they are available. We present a database schema from which most of these metrics can be computed. Aspects of each metric require further study and refinement. However, using some of them during design may still be beneficial.  相似文献   

8.
随着软件的演化,软件规模和复杂性的上升往往会造成代码设计质量的退化,进而导致代码可维护性下降。已有大量软件度量指标用于量化代码设计质量,但是由于数量众多,不同指标体系度量结果可比性较差,使得开发人员难以找到存在设计质量问题的模块。系统调研并整理归类了现有的规模、耦合、内聚、封装、继承和多态等6方面软件度量指标。结合度量指标关系以及实验分析,从每个方面的指标集中发现了与代码设计质量关联度最高的6个指标,从而提出一种面向代码设计质量监控的软件度量指标集。实验表明,该指标集可以有效挑选出存在代码设计质量问题的类,可作为开发人员监控和定位代码设计问题的重点关注指标。  相似文献   

9.
Correcting software defects accounts for a significant amount of resources in a software project. To make best use of testing efforts, researchers have studied statistical models to predict in which parts of a software system future defects are likely to occur. By studying the mathematical relations between predictor variables used in these models, researchers can form an increased understanding of the important connections between development activities and software quality. Predictor variables used in past top-performing models are largely based on source code-oriented metrics, such as lines of code or number of changes. However, source code is the end product of numerous interlaced and collaborative activities carried out by developers. Traces of such activities can be found in the various repositories used to manage development efforts. In this paper, we develop statistical models to study the impact of social interactions in a software project on software quality. These models use predictor variables based on social information mined from the issue tracking and version control repositories of two large open-source software projects. The results of our case studies demonstrate the impact of metrics from four different dimensions of social interaction on post-release defects. Our findings show that statistical models based on social information have a similar degree of explanatory power as traditional models. Furthermore, our results demonstrate that social information does not substitute, but rather augments traditional source code-based metrics used in defect prediction models.  相似文献   

10.
Software quality models can predict which modules will have high risk, enabling developers to target enhancement activities to the most problematic modules. However, many find collection of the underlying software product and process metrics a daunting task.Many software development organizations routinely use very large databases for project management, configuration management, and problem reporting which record data on events during development. These large databases can be an unintrusive source of data for software quality modeling. However, multiplied by many releases of a legacy system or a broad product line, the amount of data can overwhelm manual analysis. The field of data mining is developing ways to find valuable bits of information in very large databases. This aptly describes our software quality modeling situation.This paper presents a case study that applied data mining techniques to software quality modeling of a very large legacy telecommunications software system's configuration management and problem reporting databases. The case study illustrates how useful models can be built and applied without interfering with development.  相似文献   

11.
This paper presents a case study of a software project in the maintenance phase. The case study was based on a sample of modules, representing about 1.3 million lines of code, from a very large telecommunications system. Software quality models were developed to predict the number of faults expected from the coding through operations phases. Since modules from the prior release were often reused to develop a new release, one model incorporated reuse data as additional independent variables. We compare this model's performance to a similar model without reuse data.Software quality models often have product metrics as the only input data for predicting quality. There is an implicit assumption that all the modules have had a similar development history, so that product attributes are the primary drivers of different quality levels. Reuse of software as components and software evolution do not fit this assumption very well, and consequently, traditional models for such environments may not have adequate accuracy. Focusing on the software maintenance phase, this study demonstrated that reuse data can significantly improve the predictive accuracy of software quality models.  相似文献   

12.
Most of the search-based software remodularization (SBSR) approaches designed to address the software remodularization problem (SRP) areutilizing only structural information-based coupling and cohesion quality criteria. However, in practice apart from these quality criteria, there require other aspects of coupling and cohesion quality criteria such as lexical and changed-history in designing the modules of the software systems. Therefore, consideration of limited aspects of software information in the SBSR may generate a sub-optimal modularization solution. Additionally, such modularization can be good from the quality metrics perspective but may not be acceptable to the developers. To produce a remodularization solution acceptable from both quality metrics and developers’ perspectives, this paper exploited more dimensions of software information to define the quality criteria as modularization objectives. Further, these objectives are simultaneously optimized using a tailored many-objective artificial bee colony (MaABC) to produce a remodularization solution. To assess the effectiveness of the proposed approach, we applied it over five software projects. The obtained remodularization solutions are evaluated with the software quality metrics and developers view of remodularization. Results demonstrate that the proposed software remodularization is an effective approach for generating good quality modularization solutions.  相似文献   

13.
ContextCode ownership metrics were recently defined in order to distinguish major and minor contributors of a software module, and to assess whether the ownership of such a module is strong or shared between developers.ObjectiveThe relationship between these metrics and software quality was initially validated on proprietary software projects. Our objective in this paper is to evaluate such relationship in open-source software projects, and to compare these metrics to other code and process metrics.MethodOn a newly crafted dataset of seven open-source software projects, we perform, using inferential statistics, an analysis of code ownership metrics and their relationship with software quality.ResultsWe confirm the existence of a relationship between code ownership and software quality, but the relative importance of ownership metrics in multiple linear regression models is low compared to metrics such as the number of lines of code, the number of modifications performed over the last release, or the number of developers of a module.ConclusionAlthough we do find a relationship between code ownership and software quality, the added value of ownership metrics compared to other metrics is still to be proven.  相似文献   

14.
Predicting the fault-proneness labels of software program modules is an emerging software quality assurance activity and the quality of datasets collected from previous software version affects the performance of fault prediction models. In this paper, we propose an outlier detection approach using metrics thresholds and class labels to identify class outliers. We evaluate our approach on public NASA datasets from PROMISE repository. Experiments reveal that this novel outlier detection method improves the performance of robust software fault prediction models based on Naive Bayes and Random Forests machine learning algorithms.  相似文献   

15.
Nowadays open-source software communities are thriving. Successful open-source projects are competitive and the amount of source code that is freely available offers great reuse opportunities to software developers. Thus, it is expected that several requirements can be implemented based on open source software reuse. Additionally, design patterns, i.e. well-known solution to common design problems, are introduced as elements of reuse. This study attempts to empirically investigate the reusability of design patterns, classes and software packages. Thus, the results can help developers to identify the most beneficial starting points for white box reuse, which is quite popular among open source communities. In order to achieve this goal we conducted a case study on one hundred (100) open source projects. More specifically, we identified 27,461 classes that participate in design patterns and compared the reusability of each of these classes with the reusability of the pattern and the package that this class belongs to. In more than 40% of the cases investigated, design pattern based class selection, offers the most reusable starting point for white-box reuse. However there are several cases when package based selection might be preferable. The results suggest that each pattern has different level of reusability.  相似文献   

16.
Software product quality can be enhanced significantly if we have a good knowledge and understanding of the potential faults therein. This paper describes a study to build predictive models to identify parts of the software that have high probability of occurrence of fault. We have considered the effect of thresholds of object‐oriented metrics on fault proneness and built predictive models based on the threshold values of the metrics used. Prediction of fault prone classes in earlier phases of software development life cycle will help software developers in allocating the resources efficiently. In this paper, we have used a statistical model derived from logistic regression to calculate the threshold values of object oriented, Chidamber and Kemerer metrics. Thresholds help developers to alarm the classes that fall outside a specified risk level. In this way, using the threshold values, we can divide the classes into two levels of risk – low risk and high risk. We have shown threshold effects at various risk levels and validated the use of these thresholds on a public domain, proprietary dataset, KC1 obtained from NASA and two open source, Promise datasets, IVY and JEdit using various machine learning methods and data mining classifiers. Interproject validation has also been carried out on three different open source datasets, Ant and Tomcat and Sakura. This will provide practitioners and researchers with well formed theories and generalised results. The results concluded that the proposed threshold methodology works well for the projects of similar nature or having similar characteristics.  相似文献   

17.
ContextRecently, network measures have been proposed to predict fault-prone modules. Leveraging the dependency relationships between software entities, network measures describe the structural features of software systems. However, there is no consensus about their effectiveness for fault-proneness prediction. Specifically, the predictive ability of network measures in effort-aware context has not been addressed.ObjectiveWe aim to provide a comprehensive evaluation on the predictive effectiveness of network measures with the effort needed to inspect the code taken into consideration.MethodWe first constructed software source code networks of 11 open-source projects by extracting the data and call dependencies between modules. We then employed univariate logistic regression to investigate how each single network measure was correlated with fault-proneness. Finally, we built multivariate prediction models to examine the usefulness of network measures under three prediction settings: cross-validation, across-release, and inter-project predictions. In particular, we used the effort-aware performance indicators to compare their predictive ability against the commonly used code metrics in both ranking and classification scenarios.ResultsBased on the 11 open-source software systems, our results show that: (1) most network measures are significantly positively related to fault-proneness; (2) the performance of network measures varies under different prediction settings; (3) network measures have inconsistent effects on various projects.ConclusionNetwork measures are of practical value in the context of effort-aware fault-proneness prediction, but researchers and practitioners should be careful of choosing whether and when to use network measures in practice.  相似文献   

18.
Building effective defect-prediction models in practice   总被引:1,自引:0,他引:1  
Koru  A.G. Liu  H. 《Software, IEEE》2005,22(6):23-29
Defective software modules cause software failures, increase development and maintenance costs, and decrease customer satisfaction. Effective defect prediction models can help developers focus quality assurance activities on defect-prone modules and thus improve software quality by using resources more efficiently. These models often use static measures obtained from source code, mainly size, coupling, cohesion, inheritance, and complexity measures, which have been associated with risk factors, such as defects and changes.  相似文献   

19.
Software developers insert logging statements in their source code to record important runtime information; such logged information is valuable for understanding system usage in production and debugging system failures. However, providing proper logging statements remains a manual and challenging task. Missing an important logging statement may increase the difficulty of debugging a system failure, while too much logging can increase system overhead and mask the truly important information. Intuitively, the actual functionality of a software component is one of the major drivers behind logging decisions. For instance, a method maintaining network communications is more likely to be logged than getters and setters. In this paper, we used automatically-computed topics of a code snippet to approximate the functionality of a code snippet. We studied the relationship between the topics of a code snippet and the likelihood of a code snippet being logged (i.e., to contain a logging statement). Our driving intuition is that certain topics in the source code are more likely to be logged than others. To validate our intuition, we conducted a case study on six open source systems, and we found that i) there exists a small number of “log-intensive” topics that are more likely to be logged than other topics; ii) each pair of the studied systems share 12% to 62% common topics, and the likelihood of logging such common topics has a statistically significant correlation of 0.35 to 0.62 among all the studied systems; and iii) our topic-based metrics help explain the likelihood of a code snippet being logged, providing an improvement of 3% to 13% on AUC and 6% to 16% on balanced accuracy over a set of baseline metrics that capture the structural information of a code snippet. Our findings highlight that topics contain valuable information that can help guide and drive developers’ logging decisions.  相似文献   

20.
Software quality involves the conformance of a software product to some predefined set of functional requirements at a specified level of quality. The software is considered valid when it conforms to these “quality factors” at some acceptable level. There are a large number of quality factors against which software may be validated. This paper discusses the development of traditional software metrics in relation to the anticipated structure of a software system. The taxonomy of a software system primarily relies upon the dissection of the software system into modules. Modular design is the cornerstone of quality software, and metrics that can predict an optimum modular structure are critical. By examining the theoretical bases on quality metrics, a base set of common quantitative metrics can be devised and mapped to quality metrics in which they reside. This paper surveys existing metrics and suggests the derivation of software design metrics from software quality factors. Measurable software attributes are identified and suggested as potential design metrics.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号