Similar Literature
20 similar documents found
1.
Computer, 1990, 23(6): 85-86
Guidelines for establishing a standard metrics program for organizations are presented. Initially, it is suggested that the collection effort be minimal, meaning the data to be processed should already be in the collection phase so that the total metrics effort will not be viewed as a burden; the raw metric data must be such that it can be processed automatically; the initial metrics effort should rely on computer programs that already exist in some basic form; and the metric must be viewed as worthwhile. Sources of data are discussed. The data needed for documentation metrics, source code metrics, problem-change report metrics, cost metrics, productivity metrics, and rework metrics are identified.

2.
The desire to predict the effort in developing or explain the quality of software has led to the proposal of several metrics in the literature. As a step toward validating these metrics, the Software Engineering Laboratory has analyzed the Software Science metrics, cyclomatic complexity, and various standard program measures for their relation to 1) effort (including design through acceptance testing), 2) development errors (both discrete and weighted according to the amount of time to locate and fix), and 3) one another. The data investigated are collected from a production Fortran environment and examined across several projects at once, within individual projects, and by individual programmers across projects, with three effort reporting accuracy checks demonstrating the need to validate a database. When the data come from individual programmers or certain validated projects, the metrics' correlations with actual effort seem to be strongest. For modules developed entirely by individual programmers, the validity ratios induce a statistically significant ordering of several of the metrics' correlations. When comparing the strongest correlations, neither Software Science's E metric, cyclomatic complexity, nor source lines of code appears to relate convincingly better with effort than the others.

3.
Software comprehension is one of the largest costs in the software lifecycle. In an attempt to control the cost of comprehension, various complexity metrics have been proposed to characterize the difficulty of understanding a program and, thus, allow accurate estimation of the cost of a change. Such metrics are not always evaluated. This paper evaluates a group of metrics recently proposed to assess the "spatial complexity" of a program (spatial complexity is informally defined as the distance a maintainer must move within source code to build a mental model of that code). The evaluation takes the form of a large-scale empirical study of evolving source code drawn from a commercial organization. The results of this investigation show that most of the spatial complexity metrics evaluated offer no substantially better information about program complexity than the number of lines of code. However, one metric shows more promise and is thus deemed to be a candidate for further use and investigation.

4.
As the cost of programming becomes a major component of the cost of computer systems, it becomes imperative that program development and maintenance be better managed. One measurement a manager could use is programming complexity. Such a measure can be very useful if the manager is confident that the higher the complexity measure is for a programming project, the more effort it takes to complete the project and perhaps to maintain it. Until recently most measures of complexity were based only on intuition and experience. In the past 3 years two objective metrics have been introduced, McCabe's cyclomatic number v(G) and Halstead's effort measure E. This paper reports an empirical study designed to compare these two metrics with a classic size measure, lines of code. A fourth metric based on a model of programming is introduced and shown to be better than the previously known metrics for some experimental data.
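For reference, the two objective metrics compared in this entry can be computed from simple counts. The sketch below (Python, with hypothetical operator/operand and decision counts) shows Halstead's effort E and McCabe's cyclomatic number v(G) in their textbook forms; it illustrates the formulas only, not the paper's experimental apparatus.

```python
import math

def halstead_effort(n1, n2, N1, N2):
    """Halstead effort E = D * V.

    n1/n2: unique operators/operands; N1/N2: total operators/operands.
    V = (N1 + N2) * log2(n1 + n2), D = (n1 / 2) * (N2 / n2).
    """
    volume = (N1 + N2) * math.log2(n1 + n2)
    difficulty = (n1 / 2) * (N2 / n2)
    return difficulty * volume

def cyclomatic_number(decision_points):
    """McCabe's v(G) for a single-entry, single-exit routine: decisions + 1."""
    return decision_points + 1

# Hypothetical counts for a small routine.
print(round(halstead_effort(n1=10, n2=7, N1=25, N2=20), 1))  # effort E
print(cyclomatic_number(decision_points=4))                  # v(G) = 5
```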

5.
The increasing cost of software maintenance has resulted in greater emphasis on the production of maintainable software. One method used to enhance the development of maintainable software is to employ complexity metrics as a technique for controlling software complexity. In order to control complexity, it is imperative to plan for increases in complexity levels from code generation to code implementation. This paper presents a study of complexity increases during the testing and debugging phases of the software life cycle. The metrics used to measure complexity are lines of code, unique operators, unique operands, data difficulty, Halstead's effort and cyclomatic complexity.

6.
Many studies use logistic regression models to investigate the ability of complexity metrics to predict fault-prone classes. However, it is not uncommon to see the inappropriate use of performance indicators such as odds ratio in previous studies. In particular, a recent study by Olague et al. uses the odds ratio associated with one unit increase in a metric to compare the relative magnitude of the associations between individual metrics and fault-proneness. In addition, the percents of concordant, discordant, and tied pairs are used to evaluate the predictive effectiveness of a univariate logistic regression model. Their results suggest that lesser known complexity metrics such as standard deviation method complexity (SDMC) and average method complexity (AMC) are better predictors than the two commonly used metrics: lines of code (LOC) and weighted method McCabe complexity (WMC). In this paper, however, we show that (1) the odds ratio associated with one standard deviation increase, rather than one unit increase, in a metric should be used to compare the relative magnitudes of the effects of individual metrics on fault-proneness; otherwise, misleading results may be obtained; and that (2) the connection of the percents of concordant, discordant, and tied pairs with the predictive effectiveness of a univariate logistic regression model is false, as they indeed do not depend on the model. Furthermore, we use the data collected from three versions of Eclipse to re-examine the ability of complexity metrics to predict fault-proneness. Our experimental results reveal that: (1) many metrics exhibit moderate or almost moderate ability in discriminating between fault-prone and not fault-prone classes; (2) LOC and WMC are indeed better fault-proneness predictors than SDMC and AMC; and (3) the explanatory power of other complexity metrics in addition to LOC is limited.
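The paper's first point is essentially a change of scale: for a logistic regression coefficient β, exp(β) is the odds ratio per one-unit increase, while exp(β·σ) is the odds ratio per one-standard-deviation increase, and only the latter is comparable across metrics with very different scales (e.g., LOC vs. WMC). A minimal sketch using synthetic data and statsmodels; the data and the assumed coefficient are hypothetical, not taken from the study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
loc = rng.lognormal(mean=5.0, sigma=1.0, size=500)        # hypothetical class sizes (LOC)
p_fault = 1.0 / (1.0 + np.exp(-(0.002 * loc - 1.5)))      # assumed true fault-proneness model
faulty = rng.binomial(1, p_fault)

fit = sm.Logit(faulty, sm.add_constant(loc)).fit(disp=0)  # univariate logistic regression
beta = fit.params[1]

or_per_unit = np.exp(beta)                  # tiny: one extra line of code barely moves the odds
or_per_sd = np.exp(beta * loc.std(ddof=1))  # scale-free: comparable across different metrics
print(f"OR per unit: {or_per_unit:.4f}, OR per SD: {or_per_sd:.2f}")
```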

7.
Sallie Henry, Roger Goff. Software, 1989, 19(11): 1065-1088
For many years the software engineering community has been attacking the software reliability problem on two fronts: first, via design methodologies, languages and tools as a pre-check on complexity, and second, by measuring the complexity of produced software as a post-check. This research attempts to unify the approach to creating reliable software by providing the ability to measure the complexity of a design prior to its implementation. We have successfully defined and applied software metrics to graphical designs in an effort to predict software complexity early in the software life cycle. Metric values from the graphical design are input to predictor equations, provided in this paper, to give metric values for the resultant source code.

8.
Maintainers often face the daunting task of wading through a collection of both new and old revisions, trying to ferret out those that warrant detailed inspection. Perhaps the most obvious way to rank revisions is by lines of code (LOC); this technique has the advantage of being both simple and fast. However, most revisions are quite small, and so we would like a way of distinguishing between simple and complex changes of equal size. Classical complexity metrics, such as Halstead's and McCabe's, could be used but they are hard to apply to code fragments of different programming languages. We propose a language-independent approach to ranking revisions based on the indentation of their code fragments. We use the statistical moments of indentation as a lightweight and revision/diff friendly metric to proxy classical complexity metrics. We found that ranking revisions by the variance and summation of indentation was very similar to ranking revisions by traditional complexity measures since these measures correlate with both Halstead and McCabe complexity; this was evaluated against the CVS histories of 278 active and popular SourceForge projects. Thus, we conclude that measuring indentation alone can serve as a cheap and accurate proxy for computing the code complexity of revisions.
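The metric itself is cheap to reproduce. A minimal sketch of the idea, assuming unified-diff input and counting only added, non-blank lines; the preprocessing details (tab width, which lines count) are assumptions here, not the paper's exact procedure.

```python
import statistics

def indentation_moments(diff_lines, tab_width=8):
    """Summation and variance of leading indentation over the added lines of a diff."""
    depths = []
    for line in diff_lines:
        if not line.startswith("+") or line.startswith("+++"):
            continue                                   # keep only added code lines
        body = line[1:].expandtabs(tab_width)
        if body.strip():                               # ignore blank additions
            depths.append(len(body) - len(body.lstrip(" ")))
    if not depths:
        return 0, 0.0
    return sum(depths), statistics.pvariance(depths)

# Hypothetical revision: deeper nesting raises both moments.
patch = ["+def f(x):", "+    if x:", "+        for i in x:", "+            g(i)"]
print(indentation_moments(patch))                      # (24, 20.0) for this toy patch
```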

9.
Empirical validation of code metrics has a long history of success. Many metrics have been shown to be good predictors of external features, such as correlation to bugs. Our study provides an alternative explanation to such validation, attributing it to the confounding effect of size. In contradiction to received wisdom, we argue that the validity of a metric can be explained by its correlation to the size of the code artifact. In fact, this work came about in view of our failure in the quest of finding a metric that is both valid and free of this confounding effect. Our main discovery is that, with the appropriate (non-parametric) transformations, the validity of a metric can be accurately (with R-squared values being at times as high as 0.97) predicted from its correlation with size. The reported results are with respect to a suite of 26 metrics, that includes the famous Chidamber and Kemerer metrics. Concretely, it is shown that the more a metric is correlated with size, the more able it is to predict external feature values, and vice-versa. We consider two methods for controlling for size, by linear transformations. As it turns out, metrics controlled for size tend to lose their predictive capabilities. We also show that the famous Chidamber and Kemerer metrics are no better than other metrics in our suite. Overall, our results suggest code size is the only "unique" valid metric.
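One common way to control a metric for size with a linear transformation is to keep only the part of the metric that a least-squares fit on size cannot explain. The sketch below illustrates that idea; the variable names and toy data are illustrative, and the paper's own (non-parametric) transformations are not reproduced here.

```python
import numpy as np

def control_for_size(metric, size):
    """Residual of `metric` after a least-squares linear fit on `size`."""
    slope, intercept = np.polyfit(size, metric, deg=1)
    return np.asarray(metric) - (slope * np.asarray(size) + intercept)

# Toy example: a metric that is almost a linear function of size.
size = np.array([100, 250, 400, 800, 1200], dtype=float)      # e.g. LOC per class
metric = 0.05 * size + np.array([1.0, -2.0, 0.5, 3.0, -1.5])  # hypothetical raw metric
residual = control_for_size(metric, size)
print(np.corrcoef(size, residual)[0, 1])   # ~0: the size signal has been removed
```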

10.
Testing is the most widely adopted practice to ensure software quality. However, this activity is often a compromise between the available resources and software quality. In object-oriented development, testing effort should be focused on defective classes. Unfortunately, identifying those classes is a challenging and difficult activity on which many metrics, techniques, and models have been tried. In this paper, we investigate the usefulness of elementary design evolution metrics to identify defective classes. The metrics include the numbers of added, deleted, and modified attributes, methods, and relations. The metrics are used to recommend a ranked list of classes likely to contain defects for a system. They are compared to Chidamber and Kemerer's metrics on several versions of Rhino and of ArgoUML. Further comparison is conducted with the complexity metrics computed by Zimmermann et al. on several releases of Eclipse. The comparisons are made according to three criteria: presence of defects, number of defects, and defect density in the top-ranked classes. They show that the design evolution metrics, when used in conjunction with known metrics, improve the identification of defective classes. In addition, they show that the design evolution metrics make significantly better predictions of defect density than other metrics and, thus, can help in reducing the testing effort by focusing test activity on a reduced volume of code.

11.
One purpose of software metrics is to measure the quality of programs. The results can be used, for example, to predict maintenance costs or improve code quality. An emerging view is that if software metrics are going to be used to improve quality, they must help in finding code that should be refactored. Often refactoring or applying a design pattern is related to the role of the class to be refactored. In client-based metrics, a project gives the class a context. These metrics measure how a class is used by other classes in the context. We present a new client-based metric LCIC (Lack of Coherence in Clients), which analyses whether the class being measured has a coherent set of roles in the program. Interfaces represent the roles of classes. If a class does not have a coherent set of roles, it should be refactored, or a new interface should be defined for the class. We have implemented a tool for measuring the metric LCIC for Java projects in the Eclipse environment. We calculated LCIC values for classes of several open source projects. We compare these results with results of other related metrics, and inspect the measured classes to find out what kind of refactorings are needed. We also analyse the relation of different design patterns and refactorings to our metric. Our experiments reveal the usefulness of client-based metrics to improve the quality of code.

12.
Testability measures have been defined on flowgraphs modelling the control flow through a program. These measures attempt to quantify aspects of the structural complexity of code that might give useful information about the testing stage of software production. This paper shows how two such metrics, the Number of Trails metric and the Mask [k = 2] metric, can be calculated axiomatically.

13.
Databases are the core of Information Systems (IS). It is, therefore, necessary to ensure the quality of the databases in order to ensure the quality of the IS. Metrics are useful mechanisms for controlling database quality. This paper presents two metrics related to referential integrity, number of foreign keys (NFK) and depth of the referential tree (DRT) for controlling the quality of a relational database. However, to ascertain the practical utility of the metrics, experimental validation is necessary. This validation can be carried out through controlled experiments or through case studies. The controlled experiments must also be replicated in order to obtain firm conclusions. With this objective in mind, we have undertaken different empirical work with metrics for relational databases. As a part of this empirical work, we have conducted a case study with some metrics for relational databases and a controlled experiment with two metrics presented in this paper. The detailed experiment described in this paper is a replication of the later one. The experiment was replicated in order to confirm the results obtained from the first experiment.

As a result of all this experimental work, we can conclude that the NFK metric is a good indicator of relational database complexity. However, we cannot draw such firm conclusions regarding the DRT metric.
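As a rough illustration of the two metrics, the sketch below computes NFK and DRT over a simplified schema representation (a mapping from each table to the tables it references through foreign keys). The representation, the example tables, and the convention of counting tables rather than edges for DRT are assumptions for the sake of the example, not the paper's formal definitions.

```python
def nfk(schema):
    """Number of foreign keys: one count per declared FK reference."""
    return sum(len(refs) for refs in schema.values())

def drt(schema):
    """Depth of the referential tree: longest chain of FK references."""
    def depth(table, seen):
        if table in seen:                                  # guard against reference cycles
            return 0
        children = schema.get(table, [])
        return 1 + max((depth(t, seen | {table}) for t in children), default=0)
    return max((depth(t, frozenset()) for t in schema), default=0)

# Hypothetical schema: order_items -> orders -> customers, order_items -> products
schema = {
    "customers": [],
    "products": [],
    "orders": ["customers"],
    "order_items": ["orders", "products"],
}
print(nfk(schema))   # 3 foreign keys
print(drt(schema))   # chain order_items -> orders -> customers gives depth 3
```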


14.
The effect of object-oriented frameworks on developer productivity
Moser, S., Nierstrasz, O. Computer, 1996, 29(9): 45-51

15.
This paper presents the results of a study of the software complexity characteristics of a large real-time signal processing system for which there is a 6-yr maintenance history. The objective of the study was to compare values generated by software metrics to the maintenance history in order to determine which software complexity metrics would be most useful for estimating maintenance effort. The metrics that were analyzed were program size measures, software science measures, and control flow measures. During the course of the study two new software metrics were defined. The new metrics, maximum knot depth and knots per jump ratio, are both extensions of the knot count metric. When comparing the metrics to the maintenance data the control flow measures showed the strongest positive correlation.
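Both new measures build on the knot count, which counts pairs of jumps whose ranges cross in the program text. A minimal sketch of that underlying count, with a hypothetical jump table; the paper's extensions (maximum knot depth, knots per jump ratio) are not reproduced, though a knots-per-jump ratio follows directly by dividing by the number of jumps.

```python
def knot_count(jumps):
    """Count knots: pairs of jumps whose line ranges cross without nesting."""
    ranges = sorted(tuple(sorted(j)) for j in jumps)
    knots = 0
    for i, (a, b) in enumerate(ranges):
        for c, d in ranges[i + 1:]:
            if a < c < b < d:          # ranges interleave, so the jumps cross
                knots += 1
    return knots

# Hypothetical jump table: (source line, target line) for each goto/branch.
jumps = [(2, 10), (5, 14), (6, 8)]
print(knot_count(jumps))               # (2,10) and (5,14) cross -> 1 knot
knots_per_jump = knot_count(jumps) / len(jumps)   # simple ratio over the jump count
```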

16.
The commenter objects to the claim made by the authors of the above-mentioned article (see ibid., vol. 7, no. 2, p. 36-44, 1990) that Halstead's E (effort) metric has been extensively validated, and to a lack of precise definition for units of 'elementary mental discrimination'. With regard to the first point, S. Henry points out that since E seems to give an indication of error-prone software, it must be measuring some aspect of the source code, and that many studies have shown high correlations between effort and lines of code and McCabe's cyclomatic complexity. She agrees with the commenter's second point.

17.
Many different software metrics have been proposed and used for defect prediction in the literature. Instead of dealing with so many metrics, it would be practical and easy if we could determine the set of metrics that are most important and focus on them more to predict defectiveness. We use Bayesian networks to determine the probabilistic influential relationships among software metrics and defect proneness. In addition to the metrics used in the Promise data repository, we define two more metrics, i.e. NOD for the number of developers and LOCQ for the source code quality. We extract these metrics by inspecting the source code repositories of the selected Promise data repository data sets. At the end of our modeling, we learn the marginal defect proneness probability of the whole software system, the set of most effective metrics, and the influential relationships among metrics and defectiveness. Our experiments on nine open source Promise data repository data sets show that response for class (RFC), lines of code (LOC), and lack of coding quality (LOCQ) are the most effective metrics whereas coupling between objects (CBO), weighted method per class (WMC), and lack of cohesion of methods (LCOM) are less effective metrics on defect proneness. Furthermore, number of children (NOC) and depth of inheritance tree (DIT) have very limited effect and are untrustworthy. On the other hand, based on the experiments on Poi, Tomcat, and Xalan data sets, we observe that there is a positive correlation between the number of developers (NOD) and the level of defectiveness. However, further investigation involving a greater number of projects is needed to confirm our findings.

18.
Studies which consider the extent to which the encapsulation of a class is weakened by direct access to its hidden members (such as through the use of the friend construct in C++) are scarce, and those that do exist are based on metric suites where the enabling mechanism of the coupling is ignored. This can lead to conclusions of limited construct validity, where incorrect causes of coupling are suggested. In this paper a suite of software metrics which measure the amount of coupling enabled by different C++ programming language constructs (such as friendship and inheritance) is proposed. The metrics presented are based on a formal data model which can be easily adapted for other OO languages. This formal approach removes the scope for ambiguity in the metric definitions. These metrics provide a more accurate reflection of the causative agents of coupling in Object Oriented Systems and their utility is illustrated in an empirical study towards the end of the paper.

19.
A critique of software defect prediction models
Many organizations want to predict the number of defects (faults) in software systems, before they are deployed, to gauge the likely delivered quality and maintenance effort. To help in this, numerous software metrics and statistical models have been developed, with a correspondingly large literature. We provide a critical review of this literature and the state of the art. Most of the wide range of prediction models use size and complexity metrics to predict defects. Others are based on testing data, the "quality" of the development process, or take a multivariate approach. The authors of the models have often made heroic contributions to a subject otherwise bereft of empirical studies. However, there are a number of serious theoretical and practical problems in many studies. The models are weak because of their inability to cope with the, as yet, unknown relationship between defects and failures. There are fundamental statistical and data quality problems that undermine model validity. More significantly, many prediction models tend to model only part of the underlying problem and seriously misspecify it. To illustrate these points the Goldilocks Conjecture, that there is an optimum module size, is used to show the considerable problems inherent in current defect prediction approaches. Careful and considered analysis of past and new results shows that the conjecture lacks support and that some models are misleading. We recommend holistic models for software defect prediction, using Bayesian belief networks, as alternative approaches to the single-issue models used at present. We also argue for research into a theory of "software decomposition" in order to test hypotheses about defect introduction and help construct a better science of software engineering.

20.
Software defect prediction helps developers identify defect-prone programs in advance and allocate limited testing resources sensibly. The accuracy of defect prediction depends not only on the choice of prediction method but, even more, on the software metrics used. Combining multiple kinds of metrics for defect prediction has therefore become a current research focus. Starting from the metrics themselves, this paper systematically reviews progress on traditional metrics, multi-source metrics, and defect prediction that combines them. The main contributions are: an introduction to traditional code and process metrics, defect prediction models based on traditional metrics, and the factors that affect data quality; an exposition of semantic and structural metrics; an analysis and enumeration of the evaluation measures currently used for defect prediction; and an outlook on future research directions for multi-metric defect prediction, covering prediction granularity, traditional metrics, semantic and structural metrics, and cross-project defect prediction.
