20 related records found.
1.
Techniques for evaluating fault prediction models
Many statistical techniques have been proposed to predict the fault-proneness of program modules in software engineering. Choosing the “best” candidate among many available models involves performance assessment and detailed comparison, but these comparisons are not simple because of the variety of applicable performance measures. Classifying a software module as fault-prone implies the application of some verification activities, thus adding to the development cost. Misclassifying a module as fault-free carries the risk of system failure, which also has cost implications. Methodologies for the precise evaluation of fault prediction models should be at the core of empirical software engineering research, but they have attracted only sporadic attention. In this paper, we overview model evaluation techniques. In addition to many techniques that have been used in software engineering studies before, we introduce and discuss the merits of cost curves. Using data from a public repository, our study demonstrates the strengths and weaknesses of performance evaluation techniques and points to the conclusion that the selection of the “best” model cannot be made without considering project cost characteristics, which are specific to each development environment.
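The cost-curve idea the abstract highlights can be pictured with a small calculation: the expected misclassification cost of a fault prediction model depends on the project-specific cost of a missed fault versus an unnecessary verification, so the ranking of models can flip as that ratio changes. The following is a minimal sketch with hypothetical error rates and costs, not data from the paper.

```python
def expected_cost(fp_rate, fn_rate, p_fault, cost_fp, cost_fn):
    """Expected misclassification cost per module.

    fp_rate: fraction of fault-free modules flagged fault-prone (extra verification)
    fn_rate: fraction of fault-prone modules missed (risk of field failure)
    """
    return (1 - p_fault) * fp_rate * cost_fp + p_fault * fn_rate * cost_fn

# Two hypothetical models evaluated on the same test set.
model_a = dict(fp_rate=0.30, fn_rate=0.10)   # flags many modules, misses few faults
model_b = dict(fp_rate=0.05, fn_rate=0.40)   # flags few modules, misses more faults

p_fault = 0.2                                # assumed prior probability of a faulty module
for cost_fn in (1, 5, 20):                   # cost of a missed fault, relative to cost_fp = 1
    ca = expected_cost(**model_a, p_fault=p_fault, cost_fp=1, cost_fn=cost_fn)
    cb = expected_cost(**model_b, p_fault=p_fault, cost_fp=1, cost_fn=cost_fn)
    print(f"cost_fn={cost_fn:2d}: model A={ca:.2f}  model B={cb:.2f}")
```

With cheap missed faults model B looks better; with expensive missed faults model A does, which is exactly why the abstract argues that the "best" model cannot be chosen independently of project cost characteristics.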
2.
An empirical analysis of information retrieval based concept location techniques in software comprehension
Brendan Cleary, Chris Exton, Jim Buckley, Michael English. Empirical Software Engineering, 2009, 14(1):93-130
Concept location, the problem of associating human-oriented concepts with their counterpart solution-domain concepts, is a fundamental problem that lies at the heart of software comprehension. Recent research has attempted to alleviate the impact of the concept location problem through the application of methods drawn from the information retrieval (IR) community. Here we present a new approach based on a complementary IR method that also has a sound basis in cognitive theory. We compare our approach to related work through an experiment and present our conclusions. This research adapts and expands upon existing language modelling frameworks in IR for use in concept location in software systems. In doing so it is novel in that it leverages implicit information available in system documentation. Surprisingly, empirical evaluation of this approach showed little performance benefit overall, and several possible explanations for this finding are put forward.
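As one way to picture the kind of language-modelling IR method the abstract refers to, the sketch below scores hypothetical code documents for a concept-location query by the probability their unigram language models assign to the query, with Jelinek-Mercer smoothing against a corpus model. The file names, texts, and smoothing weight are illustrative assumptions, not the authors' approach or data.

```python
from collections import Counter
import math

docs = {
    "AccountManager.java": "account balance deposit withdraw transfer account",
    "ReportPrinter.java":  "report print format page header footer",
}
query = "account transfer".split()
lam = 0.7                                       # mixing weight for the document model

corpus = Counter()
for text in docs.values():
    corpus.update(text.split())
corpus_len = sum(corpus.values())

for name, text in docs.items():
    tokens = text.split()
    tf = Counter(tokens)
    score = 0.0
    for term in query:
        p_doc = tf[term] / len(tokens)          # term probability in this document
        p_corpus = corpus[term] / corpus_len    # background probability for smoothing
        score += math.log(lam * p_doc + (1 - lam) * p_corpus)
    print(f"{name:22s} log P(query | doc) = {score:.3f}")
```

Documents whose language model makes the query likely rank first, which is the basic ranking principle such concept-location techniques build on.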
3.
Software managers are routinely confronted with software projects that contain errors or inconsistencies and exceed budget and time limits. By mining software repositories with comprehensible data mining techniques, predictive models can be induced that offer software managers the insights they need to tackle these quality and budgeting problems in an efficient way. This paper deals with the role that the Ant Colony Optimization (ACO)-based classification technique AntMiner+ can play as a comprehensible data mining technique to predict erroneous software modules. In an empirical comparison on three real-world public datasets, the rule-based models produced by AntMiner+ are shown to achieve a predictive accuracy that is competitive with that of the models induced by several other classification techniques included in the study, such as C4.5, logistic regression, and support vector machines. In addition, we argue that the intuitiveness and comprehensibility of the AntMiner+ models can be considered superior to that of the latter models.
4.
Prediction of fault-prone modules provides one way to support software quality engineering through improved scheduling and project control. The primary goal of our research was to develop and refine techniques for early prediction of fault-prone modules. The objective of this paper is to review and improve an approach previously examined in the literature for building prediction models, i.e. principal component analysis (PCA) and discriminant analysis (DA). We present findings of an empirical study at Ericsson Telecom AB for which the previous approach was found inadequate for predicting the most fault-prone modules using software design metrics. Instead of dividing modules into fault-prone and not-fault-prone, modules are categorized into several groups according to the ordered number of faults. It is shown that the first discriminant coordinates (DC) statistically increase with the ordering of modules, thus improving prediction and prioritization efforts. The authors also experienced problems with the smoothing parameter as used previously for DA. To correct this problem and further improve predictability, separate estimation of the smoothing parameter is shown to be required.
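A minimal sketch of the general pipeline the abstract discusses, assuming synthetic design metrics and fault counts rather than the Ericsson data: modules are grouped by an ordered fault count, design metrics are reduced with PCA, and discriminant analysis yields discriminant coordinates that can be inspected against the ordering.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                  # 200 modules, 10 synthetic design metrics
faults = rng.poisson(lam=np.exp(X[:, 0]))       # fault counts loosely tied to one metric

# Categorize modules into ordered groups by fault count instead of a binary split.
groups = np.digitize(faults, bins=[1, 3, 10])   # 0: none, 1: 1-2, 2: 3-9, 3: >=10

model = make_pipeline(PCA(n_components=5), LinearDiscriminantAnalysis())
model.fit(X, groups)

# The first discriminant coordinate (DC1) can be checked against the group ordering;
# its sign is arbitrary, so a monotone trend across groups is what matters.
dc = model.transform(X)
for g in np.unique(groups):
    print(f"fault group {g}: mean DC1 = {dc[groups == g, 0].mean():.2f}")
```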
5.
Software defect prediction can guide the allocation of resources during software development and improve software quality and reliability. To make better use of the data produced during software development and to guide the development process, this work first reviews software defect management, data mining, and software development repositories, and then applies data mining techniques to such repositories. Relevant data are extracted from the version control repository and the defect tracking system; after preprocessing, these data become the subject of data mining. By selecting suitable software metrics, a new software defect prediction model is built from these metrics and its effectiveness is validated.
6.
Software defect prediction typically trains prediction models on surface-level code features and applies them to new samples, ignoring the different technical aspects and topics hidden behind the code, which leads to inaccurate predictions. To address this problem, a topic-model-based software defect prediction method is proposed. The software code base is treated as a collection of different technical aspects and topics, where different topics or technical aspects have different defect proneness. The LDA topic model is used to model the different topics and their defect proneness; topic metrics are computed from the modeling results, and traditional metrics are combined with topic metrics for model training and prediction. Experimental results show that the method achieves higher accuracy than traditional software defect prediction techniques, keeps the model relatively stable during software evolution, and is applicable to a variety of defect prediction tasks.
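A minimal sketch of the idea described above: derive topic-membership metrics for each source file with an LDA model and append them to traditional metrics before training a defect predictor. The file contents, metric values, and labels are hypothetical placeholders, and the classifier is an arbitrary stand-in rather than the method evaluated in the paper.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression

source_texts = ["open socket read buffer", "render widget layout paint",
                "parse token grammar rule", "socket timeout retry buffer"]
traditional = np.array([[120, 4], [300, 9], [80, 2], [150, 5]])   # e.g. LOC, complexity
defective = np.array([1, 0, 0, 1])                                 # hypothetical labels

counts = CountVectorizer().fit_transform(source_texts)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)

X = np.hstack([traditional, topics])          # traditional metrics + topic metrics
clf = LogisticRegression().fit(X, defective)
print(clf.predict(X))
```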
7.
Erik Arisholm, Eivind B. Johannessen. Journal of Systems and Software, 2010, 83(1):2-17
This paper describes a study performed in an industrial setting that attempts to build predictive models to identify parts of a Java system with a high fault probability. The system under consideration is constantly evolving as several releases a year are shipped to customers. Developers usually have limited resources for their testing and would like to devote extra resources to faulty system parts. The main research focus of this paper is to systematically assess three aspects of how to build and evaluate fault-proneness models in the context of this large Java legacy system development project: (1) compare many data mining and machine learning techniques to build fault-proneness models, (2) assess the impact of using different metric sets such as source code structural measures and change/fault history (process measures), and (3) compare several alternative ways of assessing the performance of the models, in terms of (i) confusion matrix criteria such as accuracy and precision/recall, (ii) ranking ability, using the area under the receiver operating characteristic curve (ROC), and (iii) our proposed cost-effectiveness measure (CE). The results of the study indicate that the choice of fault-proneness modeling technique has limited impact on the resulting classification accuracy or cost-effectiveness. There are, however, large differences between the individual metric sets in terms of cost-effectiveness, and although the process measures are among the most expensive ones to collect, including them as candidate measures significantly improves the prediction models compared with models that include only structural measures and/or their deltas between releases, both in terms of ROC area and in terms of CE. Further, we observe that what is considered the best model is highly dependent on the criteria used to evaluate and compare the models. Moreover, the regular confusion matrix criteria, although popular, are not clearly related to the problem at hand, namely the cost-effectiveness of using fault-proneness prediction models to focus verification efforts so as to deliver software with fewer faults at lower cost.
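The contrast between evaluation criteria that the abstract draws can be made concrete with a small sketch: confusion-matrix criteria and ROC area come straight from scikit-learn, while a surrogate cost-effectiveness number asks how many faults are covered when classes are inspected in order of predicted risk up to 20% of the lines of code. The data are synthetic and the CE surrogate simplifies the measure defined in the paper.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score

rng = np.random.default_rng(1)
n = 100
loc = rng.integers(50, 2000, size=n)           # size of each class in lines of code
y_true = rng.integers(0, 2, size=n)            # 1 = class turned out to be faulty
y_score = np.clip(0.3 * y_true + rng.random(n) * 0.6, 0, 1)   # noisy predicted risk
y_pred = (y_score > 0.5).astype(int)

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_score))

# Surrogate cost-effectiveness: rank classes by predicted risk and inspect them
# until 20% of the total LOC has been covered; report the share of faults caught.
order = np.argsort(-y_score)
within_budget = np.cumsum(loc[order]) <= 0.2 * loc.sum()
caught = y_true[order][within_budget].sum()
print("faults covered in first 20% of LOC:", caught / max(y_true.sum(), 1))
```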
8.
Panos Constantopoulos, Matthias Jarke, John Mylopoulos, Yannis Vassiliou. The VLDB Journal, 1995, 4(1):1-43
We present an experimental software repository system that provides organization, storage, management, and access facilities for reusable software components. The system, intended as part of an applications development environment, supports the representation of information about requirements, designs, and implementations of software, and offers facilities for the visual presentation of the software objects. This article details the features and architecture of the repository system and the technical challenges and choices made during system development, along with a usage scenario that illustrates its functionality. The system has been developed and evaluated within the context of the ITHACA project, a technology integration/software engineering project sponsored by the European Communities through the ESPRIT program, aimed at developing an integrated reuse-centered application development and support environment based on object-oriented techniques.
9.
This paper provides a systematic review of previous software fault prediction studies with a specific focus on metrics, methods, and datasets. The review covers 74 software fault prediction papers in 11 journals and several conference proceedings. According to the review results, the usage percentage of public datasets has increased significantly and the usage percentage of machine learning algorithms has increased slightly since 2005. In addition, method-level metrics are still the dominant metrics in the fault prediction research area, and machine learning algorithms are still the most popular methods for fault prediction. Researchers working in the software fault prediction area should continue to use public datasets and machine learning algorithms to build better fault predictors. The usage percentage of class-level metrics is below acceptable levels, and they should be used much more than they are now in order to predict faults earlier, in the design phase of the software life cycle.
10.
To improve the performance of software defect severity prediction, a method named ORESP, based on ordinal regression, is proposed; it fully exploits the ordering among defect severity labels. The method first uses a Spearman-based feature selection method to identify and remove redundant features in the dataset, and then builds the prediction model with a neural network based on the proportional odds model. Compared with five classical classification methods, ORESP achieves higher prediction performance under four different types of measures: on the mean zero-one error (MZE) measure, model performance improves by up to 10.3%, and on the mean absolute error (MAE) measure, by up to 12.3%. In addition, the Spearman-based feature selection method is found to effectively improve the prediction performance of ORESP.
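As a minimal sketch of the building blocks named above, the snippet below computes the two ordinal evaluation measures, MZE and MAE, and shows a simple Spearman-correlation filter for redundant features. The numbers are invented and the filter is a generic illustration, not the ORESP implementation.

```python
import numpy as np
from scipy.stats import spearmanr

def mze(y_true, y_pred):
    """Mean zero-one error: fraction of severity labels predicted incorrectly."""
    return np.mean(np.asarray(y_true) != np.asarray(y_pred))

def mae(y_true, y_pred):
    """Mean absolute error between ordinal severity labels (e.g. 1..4)."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def drop_redundant(X, threshold=0.9):
    """Keep a feature only if its |Spearman rho| with every kept feature is <= threshold."""
    kept = []
    for j in range(X.shape[1]):
        if all(abs(spearmanr(X[:, j], X[:, k])[0]) <= threshold for k in kept):
            kept.append(j)
    return kept

y_true = [1, 2, 4, 3, 2]                       # hypothetical severity labels
y_pred = [1, 3, 4, 2, 2]
print("MZE:", mze(y_true, y_pred), "MAE:", mae(y_true, y_pred))

X = np.column_stack([np.arange(10), np.arange(10) * 2,
                     np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3])])
print("kept feature indices:", drop_redundant(X))   # the second column is redundant
```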
11.
To ensure normal operation of the power supply module of aircraft information processing equipment, a typical buck converter circuit is simulated in Simulink to model the degradation of its capacitors and inductors, and the fault feature parameters of the power supply module are determined by combining theoretical capacitor and inductor degradation models. The grey-model modeling process is analyzed, and an improved grey model with adaptive data-smoothing preprocessing is derived by refining the data preprocessing and the dimension of the training data. The ripple voltage of the equipment's power supply is measured at equal time intervals, and the GM(1,1) model, the metabolic grey model, and the improved grey model are compared, using the mean relative residual as the criterion of prediction accuracy, which verifies the effectiveness and correctness of the improved grey model.
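The standard GM(1,1) grey model that the improved model above builds on is short enough to sketch directly. The series below is a hypothetical equally spaced ripple-voltage measurement; the paper's adaptive data-smoothing preprocessing and the metabolic variant are not reproduced here.

```python
import numpy as np

def gm11_forecast(x0, horizon=1):
    """Fit a GM(1,1) grey model to the series x0 and forecast `horizon` further points."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                               # accumulated generating operation (AGO)
    z1 = 0.5 * (x1[1:] + x1[:-1])                    # mean sequence of the AGO series
    B = np.column_stack([-z1, np.ones_like(z1)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0] # develop coefficient a, grey input b
    k = np.arange(len(x0) + horizon)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.diff(x1_hat, prepend=0.0)            # inverse AGO back to the original scale
    return x0_hat[len(x0):]

ripple_mv = [12.1, 12.4, 12.9, 13.5, 14.2]           # hypothetical ripple voltages (mV)
print("next predicted ripple voltage:", gm11_forecast(ripple_mv, horizon=1))
```

The mean relative residual between the measured series and the fitted values over the training window would then serve as the prediction-accuracy criterion, as in the abstract.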
12.
Context: Comparing and contrasting evidence from multiple studies is necessary to build knowledge and reach conclusions about the empirical support for a phenomenon. Therefore, research synthesis is at the center of the scientific enterprise in the software engineering discipline. Objective: The objective of this article is to contribute to a better understanding of the challenges in synthesizing software engineering research and their implications for the progress of research and practice. Method: A tertiary study of journal articles and full proceedings papers from the inception of evidence-based software engineering was performed to assess the types and methods of research synthesis in systematic reviews in software engineering. Results: As many as half of the 49 reviews included in the study did not contain any synthesis. Of the studies that did contain synthesis, two thirds performed a narrative or a thematic synthesis. Only a few studies adequately demonstrated a robust, academic approach to research synthesis. Conclusion: We concluded that, despite the focus on systematic reviews, there is limited attention paid to research synthesis in software engineering. This trend needs to change and a repertoire of synthesis methods needs to be an integral part of systematic reviews to increase their significance and utility for research and practice.
13.
Erik Linstead, Sushil Bajracharya, Trung Ngo, Paul Rigor, Cristina Lopes, Pierre Baldi. Data Mining and Knowledge Discovery, 2009, 18(2):300-336
Large repositories of source code available over the Internet, or within large organizations, create new challenges and opportunities for data mining and statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated crawling, parsing, fingerprinting, and database storage of open source software on an Internet scale. In one experiment, we gather 4,632 Java projects from SourceForge and Apache totaling over 38 million lines of code from 9,250 developers. Simple statistical analyses of the data first reveal robust power-law behavior for package, method call, and lexical containment distributions. We then develop and apply unsupervised, probabilistic, topic and author-topic (AT) models to automatically discover the topics embedded in the code and extract topic-word, document-topic, and AT distributions. In addition to serving as a convenient summary of program function and developer activities, these and other related distributions provide a statistical and information-theoretic basis for quantifying and analyzing source file similarity, developer similarity and competence, topic scattering, and document tangling, with direct applications to software engineering and software development staffing. Finally, by combining software textual content with structural information captured by our CodeRank approach, we are able to significantly improve software retrieval performance, increasing the area under the curve (AUC) retrieval metric to 0.92, roughly 10-30% better than previous approaches based on text alone. A prototype of the system is available at: .
Erik Linstead, Sushil Bajracharya, and Trung Ngo have contributed equally to this work.
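The structural side of the retrieval improvement reported above rests on a graph-rank measure (CodeRank). A hedged way to picture it is PageRank-style power iteration over a dependency graph, so heavily referenced code entities receive higher weight; the graph below is hypothetical, dangling nodes are left unhandled for brevity, and this is not the authors' exact formulation.

```python
import numpy as np

edges = [("Parser", "Lexer"), ("Parser", "SymbolTable"),
         ("TypeChecker", "SymbolTable"), ("CodeGen", "SymbolTable"),
         ("CodeGen", "Parser")]
nodes = sorted({n for e in edges for n in e})
idx = {n: i for i, n in enumerate(nodes)}

# Column-stochastic transition matrix: an edge (u, v) sends weight from u to v.
M = np.zeros((len(nodes), len(nodes)))
for u, v in edges:
    M[idx[v], idx[u]] = 1.0
col_sums = M.sum(axis=0)
M[:, col_sums > 0] /= col_sums[col_sums > 0]

d = 0.85                                              # damping factor
r = np.full(len(nodes), 1.0 / len(nodes))
for _ in range(50):                                   # power iteration
    r = (1 - d) / len(nodes) + d * (M @ r)

for n in sorted(nodes, key=lambda n: -r[idx[n]]):
    print(f"{n:12s} {r[idx[n]]:.3f}")
```

Entities such as SymbolTable, which many others depend on, end up with the highest rank; combining such a structural score with a text-retrieval score is the general idea behind the reported AUC gain.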
14.
Information and Software Technology, 2013, 55(12):2076-2098
Context: Fault localization lies at the heart of program debugging and often proceeds by contrasting the statistics of program constructs executed by passing and failing test cases. A vital issue here is how to obtain these “suitable” test cases. Techniques presented in the literature mostly assume the existence of a large test suite a priori. However, developers often encounter situations where a failure occurs, but where no test suite, or no appropriate one, is available for localizing the fault. Objective: This paper aims to alleviate this key limitation of traditional fault localization techniques for GUI software in particular; namely, it aims at enabling a cost-effective fault localization process for GUI software in the described scenario. Method: To address this scenario, we propose a mutation-oriented test data augmentation technique, which is directed by the “similarity” criterion in the test case context of GUI software towards the generation of a test suite with excellent fault localization capabilities. More specifically, the technique mainly uses four proposed novel mutation operators to iteratively mutate the event sequences of some failing GUI test cases to derive new test cases potentially useful for localizing the specific encountered fault. We then compare the fault localization performance of the test suite generated using this technique with that of an originally provided, large, event-pair adequate test suite on some GUI applications. Results: The results indicate that the proposed technique is capable of generating a test suite with fault localization effectiveness comparable, if not better, to that of the event-pair adequate test suite, but the generated suite is much smaller and is produced immediately once a failure is encountered by developers. Conclusion: It is concluded that the proposed technique can truly enable a quick-start, cost-effective fault localization process under the investigated all-too-common scenario, greatly alleviating one key limitation of traditional fault localization techniques and prompting the test-diagnose-repair cycle.
15.
Competition among today’s industrial companies is very high. Therefore, system availability plays an important role and is a critical point for most companies. Detecting failures at an early stage, or foreseeing them before they occur, is crucial for machinery availability. Data analysis is the most common method for machine health condition monitoring. In this paper we propose a fault-detection system based on data stream prediction, data stream mining, and a data stream management system (DSMS). Companies that are able to predict and avoid the occurrence of failures have an advantage over their competitors. The literature has shown that data prediction can also reduce the consumption of communication resources in distributed data stream processing.
16.
Dynamic software architecture description languages have become important tools for describing complex software architectures; however, many description languages are static and cannot describe dynamic architectures. To address this, the classical Z specification language is extended, achieving dynamic evolution mainly through the addition and removal of components and connectors. A case study demonstrates the feasibility of this extension.
17.
Since any faulty operation could directly affect the composite's properties, early prognosis is particularly crucial for complex equipment. At present, data-driven approaches are typically used for fault prediction. However, for some complex equipment it is difficult to obtain reliable and sufficient data to train the fault prediction model. To address this issue, this paper takes the autoclave as an example. A Digital Twin (DT) model of the autoclave containing multiple dimensions is first constructed and verified. Then the characteristics of the autoclave under different conditions are analyzed and described with specific parameters. The data in normal and faulty conditions are simulated using the DT model. Both the simulated data and extracted historical data are applied to enhance fault prediction. A convolutional neural network for fault prediction is trained with the generated data, which match the features of the autoclave in faulty conditions. The effectiveness of the proposed method is verified through result analysis.
18.
Packages are important high-level organizational units for large object-oriented systems. Package-level metrics characterize the attributes of packages such as size, complexity, and coupling. There is a need for empirical evidence to support the collection of these metrics and their use as early indicators of some important external software quality attributes. In this paper, three suites of package-level metrics (Martin, MOOD, and CK) are evaluated and compared empirically in predicting the number of pre-release faults and the number of post-release faults in packages. Eclipse, one of the largest open source systems, is used as a case study. The results indicate that the prediction models based on the Martin suite are more accurate than those based on the MOOD and CK suites across releases of Eclipse.
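To make the Martin suite mentioned above concrete, the sketch below computes afferent coupling (Ca), efferent coupling (Ce), instability (I = Ce / (Ca + Ce)), and distance from the main sequence (D = |A + I - 1|) for a small, hypothetical package dependency map; the packages and abstractness values are invented, not Eclipse data.

```python
from collections import defaultdict

# package -> set of packages it depends on (hypothetical example)
deps = {
    "ui":   {"core", "util"},
    "core": {"util"},
    "util": set(),
}
abstract_ratio = {"ui": 0.0, "core": 0.4, "util": 0.8}   # abstract classes / all classes (A)

ca = defaultdict(int)                                    # afferent coupling: who depends on me
for pkg, targets in deps.items():
    for t in targets:
        ca[t] += 1

for pkg in deps:
    ce = len(deps[pkg])                                  # efferent coupling: whom I depend on
    instability = ce / (ca[pkg] + ce) if (ca[pkg] + ce) else 0.0
    distance = abs(abstract_ratio[pkg] + instability - 1.0)
    print(f"{pkg:5s} Ca={ca[pkg]} Ce={ce} I={instability:.2f} D={distance:.2f}")
```

Fault prediction studies of this kind then use such per-package values as predictors of pre- and post-release fault counts.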
19.
Giuseppe Romanazzi, Peter K. Jimack, Christopher E. Goodyer. Advances in Engineering Software, 2011, 42(5):247-258
We propose a model for describing and predicting the parallel performance of a broad class of parallel numerical software on distributed memory architectures. The purpose of this model is to allow reliable predictions to be made for the performance of the software on large numbers of processors of a given parallel system, by benchmarking the code on only small numbers of processors. After describing the methods used and emphasizing the simplicity of their implementation, the approach is tested on a range of engineering software applications that are built upon the use of multigrid algorithms. Despite their simplicity, the models are demonstrated to provide both accurate and robust predictions across a range of different parallel architectures, partitioning strategies, and multigrid codes. In particular, the effectiveness of the predictive methodology is shown for a practical engineering software implementation of an elastohydrodynamic lubrication solver.
20.
Regression analysis to generate predictive equations for software development effort estimation has recently been complemented by analyses using less common methods such as fuzzy logic models. On the other hand, unless engineers have the capabilities provided by personal training, they cannot properly support their teams or consistently and reliably produce quality products. In this paper, an investigation aimed at comparing personal Fuzzy Logic Models (FLM) with a Linear Regression Model (LRM) is presented. The evaluation criteria were based mainly upon the magnitude of error relative to the estimate (MER) as well as the mean of MER (MMER). One hundred and five small programs were developed by thirty programmers. From these programs, three FLM were generated to estimate the effort in the development of twenty programs by seven programmers. Both verification and validation of the models were performed. Results show slightly better predictive accuracy for the FLM than for the LRM when estimating development effort at the personal level for small programs.
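The accuracy criteria used above are simple to state: for each program, MER is the magnitude of the error relative to the estimate, and MMER is the mean MER over all programs. A minimal sketch with invented effort values:

```python
def mer(actual, estimate):
    """Magnitude of error relative to the estimate."""
    return abs(actual - estimate) / estimate

actual_effort    = [12.0, 30.0, 7.5, 18.0]     # e.g. person-hours per small program (hypothetical)
estimated_effort = [10.0, 33.0, 9.0, 18.0]

mers = [mer(a, e) for a, e in zip(actual_effort, estimated_effort)]
mmer = sum(mers) / len(mers)
print("MER per program:", [round(m, 3) for m in mers])
print("MMER:", round(mmer, 3))
```

A lower MMER indicates, on average, estimates that are closer to the actual effort, which is how the FLM and LRM are compared in the study.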