期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Empirical comparison of three metrics suites for fault prediction in packages of object-oriented systems: A case study of Eclipse

Mahmoud O. Elish Ali H. Al-Yafei 《Advances in Engineering Software》2011,42(10):852-859

Packages are important high-level organizational units for large object-oriented systems. Package-level metrics characterize the attributes of packages such as size, complexity, and coupling. There is a need for empirical evidence to support the collection of these metrics and using them as early indicators of some important external software quality attributes. In this paper, three suites of package-level metrics (Martin, MOOD and CK) are evaluated and compared empirically in predicting the number of pre-release faults and the number of post-release faults in packages. Eclipse, one of the largest open source systems, is used as a case study. The results indicate that the prediction models that are based on Martin suite are more accurate than those that are based on MOOD and CK suites across releases of Eclipse. 相似文献

2.

The effectiveness of software metrics in identifying error-prone classes in post-release software evolution process

Raed Shatnawi Author Vitae Wei Li^{Author Vitae} 《Journal of Systems and Software》2008,81(11):1868-1882

Many empirical studies have found that software metrics can predict class error proneness and the prediction can be used to accurately group error-prone classes. Recent empirical studies have used open source systems. These studies, however, focused on the relationship between software metrics and class error proneness during the development phase of software projects. Whether software metrics can still predict class error proneness in a system’s post-release evolution is still a question to be answered. This study examined three releases of the Eclipse project and found that although some metrics can still predict class error proneness in three error-severity categories, the accuracy of the prediction decreased from release to release. Furthermore, we found that the prediction cannot be used to build a metrics model to identify error-prone classes with acceptable accuracy. These findings suggest that as a system evolves, the use of some commonly used metrics to identify which classes are more prone to errors becomes increasingly difficult and we should seek alternative methods (to the metric-prediction models) to locate error-prone classes if we want high accuracy. 相似文献

3.

Analyzing software measurement data with clustering techniques 总被引：1，自引：0，他引：1

Zhong S. Khoshgoftaar T.M. Seliya N. 《Intelligent Systems, IEEE》2004,19(2):20-27

For software quality estimation, software development practitioners typically construct quality-classification or fault prediction models using software metrics and fault data from a previous system release or a similar software project. Engineers then use these models to predict the fault proneness of software modules in development. Software quality estimation using supervised-learning approaches is difficult without software fault measurement data from similar projects or earlier system releases. Cluster analysis with expert input is a viable unsupervised-learning solution for predicting software modules' fault proneness and potential noisy modules. Data analysts and software engineering experts can collaborate more closely to construct and collect more informative software metrics. 相似文献

4.

A model-driven traceability framework for software product lines 总被引：1，自引：0，他引：1

Nicolas Anquetil Uirá Kulesza Ralf Mitschke Ana Moreira Jean-Claude Royer Andreas Rummler André Sousa 《Software and Systems Modeling》2010,9(4):427-451

Software product line (SPL) engineering is a recent approach to software development where a set of software products are derived for a well defined target application domain, from a common set of core assets using analogous means of production (for instance, through Model Driven Engineering). Therefore, such family of products are built from reuse, instead of developed individually from scratch. SPL promise to lower the costs of development, increase the quality of software, give clients more flexibility and reduce time to market. These benefits come with a set of new problems and turn some older problems possibly more complex. One of these problems is traceability management. In the European AMPLE project we are creating a common traceability framework across the various activities of the SPL development. We identified four orthogonal traceability dimensions in SPL development, one of which is an extension of what is often considered as “traceability of variability”. This constitutes one of the two contributions of this paper. The second contribution is the specification of a metamodel for a repository of traceability links in the context of SPL and the implementation of a respective traceability framework. This framework enables fundamental traceability management operations, such as trace import and export, modification, query and visualization. The power of our framework is highlighted with an example scenario. 相似文献

5.

Predicting the location and number of faults in large software systems 总被引：6，自引：0，他引：6

Ostrand T.J. Weyuker E.J. Bell R.M. 《IEEE transactions on pattern analysis and machine intelligence》2005,31(4):340-355

Advance knowledge of which files in the next release of a large software system are most likely to contain the largest numbers of faults can be a very valuable asset. To accomplish this, a negative binomial regression model has been developed and used to predict the expected number of faults in each file of the next release of a system. The predictions are based on the code of the file in the current release, and fault and modification history of the file from previous releases. The model has been applied to two large industrial systems, one with a history of 17 consecutive quarterly releases over 4 years, and the other with nine releases over 2 years. The predictions were quite accurate: for each release of the two systems, the 20 percent of the files with the highest predicted number of faults contained between 71 percent and 92 percent of the faults that were actually detected, with the overall average being 83 percent. The same model was also used to predict which files of the first system were likely to have the highest fault densities (faults per KLOC). In this case, the 20 percent of the files with the highest predicted fault densities contained an average of 62 percent of the system's detected faults. However, the identified files contained a much smaller percentage of the code mass than the files selected to maximize the numbers of faults. The model was also used to make predictions from a much smaller input set that only contained fault data from integration testing and later. The prediction was again very accurate, identifying files that contained from 71 percent to 93 percent of the faults, with the average being 84 percent. Finally, a highly simplified version of the predictor selected files containing, on average, 73 percent and 74 percent of the faults for the two systems. 相似文献

6.

针对基于变异错误定位的一种动态变异执行策略

龚沛耿楚瑶郭俊霞赵瑞莲《计算机科学》2016,43(2):199-203, 229

在软件调试过程中,如何快速、精确地定位程序中的错误代码是软件开发人员普遍关注的问题。基于变异的错误定位方法是一种通过分析被测程序与程序变异体之间的行为相似性来估计语句出错概率、进行错误定位的方法。该方法有较高的错误定位精确度,但由于需对大量程序变异体执行测试用例集,因此其变异执行开销较大。为此提出了一种动态变异执行策略,它通过搜集测试用例执行信息,动态地调整变异体及测试用例的执行顺序,以减少其变异执行开销。实验结果表明,在6个程序包的127个错误版本上,应用提出的动态变异执行策略可在保证错误定位精确度的前提下,减少23%~78%的变异执行开销,显著提高了基于变异的错误定位方法的效率。相似文献

7.

Accuracy of software quality models over multiple releases

Taghi M. Khoshgoftaar Edward B. Allen Wendell D. Jones John P. Hudepohl 《Annals of Software Engineering》2000,9(1-2):103-116

Many evolving mission‐critical systems must have high software reliability. However, it is often difficult to identify fault‐prone modules early enough in a development cycle to guide software enhancement efforts effectively and efficiently. Software quality models can yield timely predictions of membership in the fault‐prone class on a module‐by‐module basis, enabling one to target enhancement techniques. However, it is an open empirical question, “Can a software quality model remain useful over several releases?” Most prior software quality studies have examined only one release of a system, evaluating the model with modules from the same release. We conducted a case study of a large legacy telecommunications system where measurements on one software release were used to build models, and three subsequent releases of the same system were used to evaluate model accuracy. This is a realistic assessment of model accuracy, closely simulating actual use of a software quality model. A module was considered fault‐prone if any of its faults were discovered by customers. These faults are extremely expensive due to consequent loss of service and emergency repair efforts. We found that the model maintained useful accuracy over several releases. These findings are initial empirical evidence that software quality models can remain useful as a system is maintained by a stable software development process. 相似文献

8.

Evaluating scenario-based SPL requirements approaches: the case for modularity,stability and expressiveness

Mauricio Alférez Rodrigo Bonifácio Leopoldo Teixeira Paola Accioly Uirá Kulesza Ana Moreira João Araújo Paulo Borba 《Requirements Engineering》2014,19(4):355-376

Software product lines (SPL) provide support for productivity gains through systematic reuse. Among the various quality attributes supporting these goals, modularity, stability and expressiveness of feature specifications, their composition and configuration knowledge emerge as strategic values in modern software development paradigms. This paper presents a metric-based evaluation aiming at assessing how well the chosen qualities are supported by scenario-based SPL requirements approaches. The selected approaches for this study span from type of notation (textual or graphical based), style to support variability (annotation or composition based), and specification expressiveness. They are compared using the metrics developed in a set of releases from an exemplar case study. Our major findings indicate that composition-based approaches have greater potential to support modularity and stability, and that quantification mechanisms simplify and increase expressiveness of configuration knowledge and composition specifications. 相似文献

9.

面向语句的MBFL变异体约减策略

王林鑫王微微赵瑞莲李征《计算机科学》2017,44(11):175-180

在软件调试过程中如何高效、精确地定位程序中的错误代码是软件开发人员普遍关注的问题。MBFL是一种基于变异分析的错误定位技术,它在获得较高错误定位精度的同时会生成大量变异体,并在变异体上执行测试用例集,开销庞大。为了减少MBFL的变异执行开销,提出面向语句的变异体约减策略,通过分析测试用例的执行信息, 按一定比例对每条由失败测试用例覆盖的语句生成的变异体集合进行约减。实验结果表明,在7个程序包的112个错误版本上,应用面向语句的变异体约减策略的MBFL,在保持较高错误定位精度的同时,能够有效减少73.51%~79.98%的变异执行开销。相似文献

10.

Assessing the applicability of fault-proneness models across object-oriented software projects 总被引：1，自引：0，他引：1

Briand L.C. Melo W.L. Wust J. 《IEEE transactions on pattern analysis and machine intelligence》2002,28(7):706-720

A number of papers have investigated the relationships between design metrics and the detection of faults in object-oriented software. Several of these studies have shown that such models can be accurate in predicting faulty classes within one particular software product. In practice, however, prediction models are built on certain products to be used on subsequent software development projects. How accurate can these models be, considering the inevitable differences that may exist across projects and systems? Organizations typically learn and change. From a more general standpoint, can we obtain any evidence that such models are economically viable tools to focus validation and verification effort? This paper attempts to answer these questions by devising a general but tailorable cost-benefit model and by using fault and design data collected on two mid-size Java systems developed in the same environment. Another contribution of the paper is the use of a novel exploratory analysis technique - MARS (multivariate adaptive regression splines) to build such fault-proneness models, whose functional form is a-priori unknown. The results indicate that a model built on one system can be accurately used to rank classes within another system according to their fault proneness. The downside, however, is that, because of system differences, the predicted fault probabilities are not representative of the system predicted. However, our cost-benefit model demonstrates that the MARS fault-proneness model is potentially viable, from an economical standpoint. The linear model is not nearly as good, thus suggesting a more complex model is required. 相似文献

11.

基于χChek的软件产品线多值模型检测方法

黄鸣宇石玉峰《计算机与现代化》2014,(8):87-90

软件产品线保持产品个性化的同时提高了公共部分的复用。但软件产品线中包含的不确定信息,给产品带来了潜在风险。形式化验证技术逐步应用于软件产品线验证。但是传统的布尔逻辑模型不能很好地描述软件产品线的不确定性和不一致性。本文结合多值模型检测器χChek,通过基于动作的模型描述方法,对软件产品线进行描述,然后转换成为χChek规定的模型格式,同时提供多值逻辑描述。最后采用计算树逻辑描述产品线属性,使用χChek进行验证。相似文献

12.

Assessing software product line potential: an exploratory industrial case study

Heiko Koziolek Thomas Goldschmidt Thijmen de Gooijer Dominik Domis Stephan Sehestedt Thomas Gamer Markus Aleksy 《Empirical Software Engineering》2016,21(2):411-448

Corporate organizations sometimes offer similar software products in certain domains due to former company mergers or due to the complexity of the organization. The functional overlap of such products is an opportunity for future systematic reuse to reduce software development and maintenance costs. Therefore, we have tailored existing domain analysis methods to our organization to identify commonalities and variabilities among such products and to assess the potential for software product line (SPL) approaches. As an exploratory case study, we report on our experiences and lessons learned from conducting the domain analysis in four application cases with large-scale software products. We learned that the outcome of a domain analysis was often a smaller integration scenario instead of an SPL and that business case calculations were less relevant for the stakeholders and managers from the business units during this phase. We also learned that architecture reconstruction using a simple block diagram notation aids domain analysis and that large parts of our approach were reusable across application cases. 相似文献

13.

The limited impact of individual developer data on software defect prediction

Robert M. Bell Thomas J. Ostrand Elaine J. Weyuker 《Empirical Software Engineering》2013,18(3):478-505

Previous research has provided evidence that a combination of static code metrics and software history metrics can be used to predict with surprising success which files in the next release of a large system will have the largest numbers of defects. In contrast, very little research exists to indicate whether information about individual developers can profitably be used to improve predictions. We investigate whether files in a large system that are modified by an individual developer consistently contain either more or fewer faults than the average of all files in the system. The goal of the investigation is to determine whether information about which particular developer modified a file is able to improve defect predictions. We also extend earlier research evaluating use of counts of the number of developers who modified a file as predictors of the file’s future faultiness. We analyze change reports filed for three large systems, each containing 18 releases, with a combined total of nearly 4 million LOC and over 11,000 files. A buggy file ratio is defined for programmers, measuring the proportion of faulty files in Release R out of all files modified by the programmer in Release R-1. We assess the consistency of the buggy file ratio across releases for individual programmers both visually and within the context of a fault prediction model. Buggy file ratios for individual programmers often varied widely across all the releases that they participated in. A prediction model that takes account of the history of faulty files that were changed by individual developers shows improvement over the standard negative binomial model of less than 0.13% according to one measure, and no improvement at all according to another measure. In contrast, augmenting a standard model with counts of cumulative developers changing files in prior releases produced up to a 2% improvement in the percentage of faults detected in the top 20% of predicted faulty files. The cumulative number of developers interacting with a file can be a useful variable for defect prediction. However, the study indicates that adding information to a model about which particular developer modified a file is not likely to improve defect predictions. 相似文献

14.

Deriving a Fault Architecture to Guide Testing

Stringfellow C. Andrews A. 《Software Quality Journal》2002,10(4):299-330

Defect analysis of software components can be used to guide testing, with the goal of focusing on parts of the software that were fault-prone in earlier releases or earlier life cycle phases, such as development. We replicate a study that adapted a reverse architecting technique using defect reports to derive fault architectures. A fault architecture determines and visualizes components that are fault-prone in their relationships with other components, as well as those that are locally fault-prone. Our case study uses defect data from three releases of a large medical record system to identify relationships among system components, based on whether they are involved in the same defect report.We investigate measures that assess the fault-proneness of components and component relationships. Component relationships are used to derive a fault architecture. The resulting fault architecture indicates what the most fault-prone relationships are in a release. We also apply the technique in a new way. Not only do we derive fault architectures for each release, we derive fault architectures for the development, system test and post release phases within each release. Comparing across releases, makes it possible to see whether some components are repeatedly in fault-prone relationships. Comparing across phases, makes it possible to see whether development fault architectures can be used to identify those parts of the software that need to be tested more. We validate our predictions using system test data from the same release. We also use the development and system test fault architectures to identify fault-prone components after release, and validate our predictions using post release data. 相似文献

15.

A systematic review of quality attributes and measures for software product lines

Sonia Montagud Silvia Abrah?o Emilio Insfran 《Software Quality Journal》2012,20(3-4):425-486

It is widely accepted that software measures provide an appropriate mechanism for understanding, monitoring, controlling, and predicting the quality of software development projects. In software product lines (SPL), quality is even more important than in a single software product since, owing to systematic reuse, a fault or an inadequate design decision could be propagated to several products in the family. Over the last few years, a great number of quality attributes and measures for assessing the quality of SPL have been reported in literature. However, no studies summarizing the current knowledge about them exist. This paper presents a systematic literature review with the objective of identifying and interpreting all the available studies from 1996 to 2010 that present quality attributes and/or measures for SPL. These attributes and measures have been classified using a set of criteria that includes the life cycle phase in which the measures are applied; the corresponding quality characteristics; their support for specific SPL characteristics (e.g., variability, compositionality); the procedure used to validate the measures, etc. We found 165 measures related to 97 different quality attributes. The results of the review indicated that 92% of the measures evaluate attributes that are related to maintainability. In addition, 67% of the measures are used during the design phase of Domain Engineering, and 56% are applied to evaluate the product line architecture. However, only 25% of them have been empirically validated. In conclusion, the results provide a global vision of the state of the research within this area in order to help researchers in detecting weaknesses, directing research efforts, and identifying new research lines. In particular, there is a need for new measures with which to evaluate both the quality of the artifacts produced during the entire SPL life cycle and other quality characteristics. There is also a need for more validation (both theoretical and empirical) of existing measures. In addition, our results may be useful as a reference guide for practitioners to assist them in the selection or the adaptation of existing measures for evaluating their software product lines. 相似文献

16.

Evaluating the impact of design pattern and anti-pattern dependencies on changes and faults

Fehmi Jaafar Yann-Gaël Guéhéneuc Sylvie Hamel Foutse Khomh Mohammad Zulkernine 《Empirical Software Engineering》2016,21(3):896-931

On the one hand, design patterns are solutions to recurring design problems, aimed at increasing reuse, flexibility, and maintainability. However, much prior work found that some patterns, such as the Observer and Singleton, are correlated with large code structures and argued that they are more likely to be fault prone. On the other hand, anti-patterns describe poor solutions to design and implementation problems that highlight weaknesses in the design of software systems and that may slow down maintenance and increase the risk of faults. They have been found to negatively impact change and fault-proneness. Classes participating in design patterns and anti-patterns have dependencies with other classes, e.g., static and co-change dependencies, that may propagate problems to other classes. We investigate the impact of such dependencies in object-oriented systems by studying the relations between the presence of static and co-change dependencies and (1) the fault-proneness, (2) the types of changes, and (3) the types of faults that these classes exhibit. We analyze six design patterns and 10 anti-patterns in 39 releases of ArgoUML, JFreeChart, and XercesJ, and investigate to what extent classes having dependencies with design patterns or anti-patterns have higher odds of faults than other classes. We show that in almost all releases of the three systems, classes having dependencies with anti-patterns are more fault-prone than others while this is not always true for classes with dependencies with design patterns. We also observe that structural changes are the most common changes impacting classes having dependencies with anti-patterns. Software developers could use this knowledge about the impact of design pattern and anti-pattern dependencies to better focus their testing and reviewing activities towards the most risky classes and to propagate changes adequately. 相似文献

17.

软件产品线度量及应用研究 总被引：1，自引：0，他引：1

宁安良侯红鱼滨郝克刚《计算机应用与软件》2007,24(9):60-62,71

软件复用是提高软件生产力、软件质量的最有潜力的领域,软件产品线实质上是最高级别的软件复用.软件产品线对当前的软件密集性项目的管理提出了新的挑战,它需要管理者有超越单一产品的战略考虑,需要有组织性的预见、调查、规划和指导.一直以来软件过程管理和改进的重要思路是:在软件产品开发管理实践中应用软件度量技术,即通过分析软件过程和产品的相关属性,从而为管理决策实践提供客观的数据支持.针对软件产品线管理的一些关键目标,提出了一些重要的度量技术思路,分析了相应的度量指标,满足了软件产品线各层次管理角色的不同的信息需求. 相似文献

18.

Semantic metrics for software products

A. Mili A. Jaoua M. Frias Rasha Gaffer Mohamed Helali 《Innovations in Systems and Software Engineering》2014,10(3):203-217

Like all engineering disciplines, software engineering relies on quantitative analysis to support rationalized decision making. Software engineering researchers and practitioners have traditionally relied on software metrics to quantify attributes of software products and processes. Whereas traditional software metrics are typically based on a syntactic analysis of software products, we introduce and discuss metrics that are based on a semantic analysis: our metrics do not reflect the form or structure of software products, but rather the properties of their function. At a time when software systems grow increasingly large and complex, the focus on diagnosing, identifying and removing every fault in the software product ought to relinquish the stage to a more measured, more balanced, and more realistic approach, which emphasizes failure avoidance, in addition to fault avoidance and fault removal. Semantic metrics are a good fit for this purpose, reflecting as they do a system’s ability to avoid failure rather than its proneness to being free of faults. 相似文献

19.

Software quality measurement based on fault-detection data

Weerahandi S. Hausman R.E. 《IEEE transactions on pattern analysis and machine intelligence》1994,20(9):665-676

We develop a methodology to measure the quality levels of a number of releases of a software product in its evolution process. The proposed quality measurement plan is based on the faults detected in field operation of the software. We describe how fault discovery data can be analyzed and reported in a framework very similar to that of the QMP (quality measurement plan) proposed by B. Hoadley (1986). The proposed procedure is especially useful in situations where one has only very little data from the latest release. We present details of implementation of solutions to a class of models on the distribution of fault detection times. The conditions under which the families: exponential, Weibull, or Pareto distributions might be appropriate for fault detection times are discussed. In a variety of typical data sets that we investigated one of these families was found to provide a good fit for the data. The proposed methodology is illustrated with an example involving three releases of a software product, where the fault detection times are exponentially distributed. Another example for a situation where the exponential fit is not good enough is also considered 相似文献

20.

Fault Prediction Modeling for Software Quality Estimation: Comparing Commonly Used Techniques 总被引：6，自引：0，他引：6

Taghi M. Khoshgoftaar Naeem Seliya 《Empirical Software Engineering》2003,8(3):255-283

High-assurance and complex mission-critical software systems are heavily dependent on reliability of their underlying software applications. An early software fault prediction is a proven technique in achieving high software reliability. Prediction models based on software metrics can predict number of faults in software modules. Timely predictions of such models can be used to direct cost-effective quality enhancement efforts to modules that are likely to have a high number of faults. We evaluate the predictive performance of six commonly used fault prediction techniques: CART-LS (least squares), CART-LAD (least absolute deviation), S-PLUS, multiple linear regression, artificial neural networks, and case-based reasoning. The case study consists of software metrics collected over four releases of a very large telecommunications system. Performance metrics, average absolute and average relative errors, are utilized to gauge the accuracy of different prediction models. Models were built using both, original software metrics (RAW) and their principle components (PCA). Two-way ANOVA randomized-complete block design models with two blocking variables are designed with average absolute and average relative errors as response variables. System release and the model type (RAW or PCA) form the blocking variables and the prediction technique is treated as a factor. Using multiple-pairwise comparisons, the performance order of prediction models is determined. We observe that for both average absolute and average relative errors, the CART-LAD model performs the best while the S-PLUS model is ranked sixth. 相似文献