20 similar documents retrieved.
1.
2.
Software managers are routinely confronted with software projects that contain errors or inconsistencies and exceed budget and time limits. By mining software repositories with comprehensible data mining techniques, predictive models can be induced that offer software managers the insights they need to tackle these quality and budgeting problems in an efficient way. This paper deals with the role that the Ant Colony Optimization (ACO)-based classification technique AntMiner+ can play as a comprehensible data mining technique to predict erroneous software modules. In an empirical comparison on three real-world public datasets, the rule-based models produced by AntMiner+ are shown to achieve a predictive accuracy competitive with that of the models induced by several other classification techniques, such as C4.5, logistic regression and support vector machines. In addition, we argue that the intuitiveness and comprehensibility of the AntMiner+ models can be considered superior to those of the latter models.
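Since AntMiner+ has no standard Python implementation, a minimal sketch of the kind of benchmark the abstract describes can use a shallow decision tree (a C4.5-style, rule-like learner) as the comprehensible model. The dataset file and column names below are hypothetical placeholders:

```python
# Sketch: comparing a comprehensible learner against black-box classifiers
# on a public defect dataset, in the spirit of the abstract's comparison.
# "defects.csv" and the "defective" column are hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

df = pd.read_csv("defects.csv")                    # one row per module
X, y = df.drop(columns=["defective"]), df["defective"]

models = {
    "decision tree (C4.5-style)": DecisionTreeClassifier(max_depth=4),
    "logistic regression": LogisticRegression(max_iter=1000),
    "SVM (RBF)": SVC(),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=10).mean()
    print(f"{name}: mean 10-fold accuracy = {acc:.3f}")
```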
3.
Software reliability is one of the most important software quality indicators. It is concerned with the probability that the software can execute without any unintended behavior in a given environment. In previous research we developed the Reliability Prediction System (RePS) methodology to predict the reliability of safety-critical software such as that used in the nuclear industry. A RePS methodology relates software engineering measures to software reliability using various models, and RePSs using Extended Finite State Machine (EFSM) models and fault data collected through various software engineering measures were found to possess the most satisfactory prediction capability. In this research the EFSM-based RePS methodology is improved and implemented in a tool called the Automated Reliability Prediction System (ARPS). The features of the ARPS tool are introduced with a simple case study. An experiment using human subjects was also conducted to evaluate the usability of the tool, and the results demonstrate that the ARPS tool can indeed help analysts apply the EFSM-based RePS methodology with fewer errors and lower error criticality.
4.
New methodologies and tools have gradually made the software development life cycle more human-independent. Much of the research in this field focuses on defect reduction, defect identification and defect prediction. Defect prediction is a relatively new research area that involves using various methods from artificial intelligence to data mining. Identifying and locating defects in software projects is a difficult task. Measuring software in a continuous and disciplined manner provides many advantages, such as the accurate estimation of project costs and schedules and the improvement of product and process quality. This study proposes a model to predict the number of defects in the new version of a software product with respect to the previous stable version. The new version may contain changes related to a new feature, a modification in an algorithm, or bug fixes. Our proposed model predicts the defects introduced into the new version by analyzing the types of changes in an objective and formal manner as well as considering the lines of code (LOC) changed. Defect predictors are helpful tools for both project managers and developers. Accurate predictors may help reduce test times and guide developers toward writing higher-quality code. Our proposed model can aid software engineers in determining the stability of software before it goes into production. Furthermore, such a model may provide useful insight for understanding the effects of a feature, bug fix or change on defect detection.
Ayşe Basar Bener
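A minimal sketch of the kind of model this abstract proposes, assuming hypothetical per-release training data with change-type counts and LOC changed as features; a Poisson regressor is one reasonable choice for non-negative defect counts, not necessarily the paper's formulation:

```python
# Sketch: predicting the defect count of a new release from the types of
# changes made since the previous stable version. "release_history.csv"
# and all feature names are hypothetical.
import pandas as pd
from sklearn.linear_model import PoissonRegressor

hist = pd.read_csv("release_history.csv")      # one row per past release
features = ["feature_changes", "algorithm_changes", "bug_fixes", "loc_changed"]
model = PoissonRegressor()                     # defect counts are non-negative
model.fit(hist[features], hist["defects"])

new_release = pd.DataFrame(
    [{"feature_changes": 3, "algorithm_changes": 1,
      "bug_fixes": 12, "loc_changed": 450}])
print("expected new defects:", model.predict(new_release)[0])
```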
5.
Khoshgoftaar T.M., Yi Liu, Seliya N. IEEE Transactions on Evolutionary Computation, 2004, 8(6):593-608
The knowledge, prior to system operations, of which program modules are problematic is valuable to a software quality assurance team, especially when there is a constraint on software quality enhancement resources. A cost-effective approach for allocating such resources is to obtain a prediction in the form of a quality-based ranking of program modules. Subsequently, a module-order model (MOM) is used to gauge the performance of the predicted rankings. From a practical software engineering point of view, multiple software quality objectives may be desired of a MOM for the system under consideration: e.g., the desired rankings may be such that 100% of the faults should be detected if the top 50% of modules with the highest numbers of faults are subjected to quality improvements. Moreover, the management team for the same system may also desire that 80% of the faults should be accounted for if the top 20% of the modules are targeted for improvement. Existing work related to MOMs uses a quantitative prediction model to obtain the predicted rankings of program modules, implying that only fault prediction error measures such as the average, relative, or mean square errors are minimized. Such an approach does not provide direct insight into the performance behavior of a MOM. For a given percentage of modules enhanced, the performance of a MOM is gauged by how many faults are accounted for by the predicted ranking as compared with the perfect ranking. We propose an approach for calibrating a multiobjective MOM using genetic programming. Other estimation techniques, e.g., multiple linear regression and neural networks, cannot achieve multiobjective optimization for MOMs. The proposed methodology facilitates the simultaneous optimization of multiple performance objectives for a MOM. Case studies of two industrial software systems are presented, the empirical results of which demonstrate new promise for goal-oriented software quality modeling.
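The MOM performance measure described above is straightforward to state in code: for a cutoff percentage, compare the fraction of all faults captured by the predicted ranking with the fraction captured by the perfect ranking. A small sketch with made-up fault counts:

```python
# Sketch: gauging a module-order model by how many faults the predicted
# ranking accounts for relative to the perfect (actual-faults) ranking.
# The fault counts below are illustrative.
import numpy as np

def faults_accounted(faults, order, top_fraction):
    """Fraction of all faults in the top `top_fraction` of `order`."""
    k = max(1, int(len(order) * top_fraction))
    return faults[order[:k]].sum() / faults.sum()

actual = np.array([9, 0, 4, 1, 0, 7, 2, 0, 3, 5])      # faults per module
predicted = np.array([8, 1, 3, 0, 1, 9, 1, 0, 2, 6])   # model estimates

perfect_order = np.argsort(-actual)
predicted_order = np.argsort(-predicted)
for frac in (0.2, 0.5):
    phi = faults_accounted(actual, predicted_order, frac)
    omega = faults_accounted(actual, perfect_order, frac)
    print(f"top {frac:.0%}: predicted {phi:.2f} vs perfect {omega:.2f}")
```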
6.
Context
Software defect prediction studies usually build models using within-company data, but very few have focused on prediction models trained with cross-company data. Models built on within-company data are difficult to employ in practice because such local data repositories are often unavailable. Recently, transfer learning has attracted more and more attention for building classifiers in a target domain using data from a related source domain. It is very useful when the distributions of training and test instances differ, but is it appropriate for cross-company software defect prediction?
Objective
In this paper, we consider the cross-company defect prediction scenario, where source and target data are drawn from different companies. In order to harness cross-company data, we exploit a transfer learning method to build a fast and highly effective prediction model.
Method
Unlike prior work that selects training data similar to the test data, we propose a novel algorithm called Transfer Naive Bayes (TNB) that uses the information of all the proper features in the training data. Our solution estimates the distribution of the test data and transfers cross-company data information into the weights of the training data. The defect prediction model is built on these weighted data.
Results
This article presents a theoretical analysis of the comparative methods and reports experimental results on data sets from different organizations. They indicate that TNB is more accurate in terms of AUC (the area under the receiver operating characteristic curve) and runs faster than state-of-the-art methods.
Conclusion
We conclude that when there are too few local training data to train good classifiers, useful knowledge transferred from differently distributed training data at the feature level may help. We are optimistic that our transfer learning method can guide optimal resource allocation strategies, which may reduce software testing cost and increase the effectiveness of the software testing process.
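A minimal sketch of the weighting idea behind TNB, assuming a simplified reading of the abstract (training instances whose feature values fall within the ranges observed in the test data receive larger weights) rather than the paper's exact formula; the data here are synthetic:

```python
# Sketch: weight cross-company training instances by how well their
# feature values agree with the target data's ranges, then train a
# weighted Naive Bayes and score AUC on the target company. The weight
# rule is a simplified reading of TNB, not the paper's exact derivation.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import roc_auc_score

def similarity_weights(X_train, X_test):
    lo, hi = X_test.min(axis=0), X_test.max(axis=0)
    s = ((X_train >= lo) & (X_train <= hi)).sum(axis=1)  # matching features
    k = X_train.shape[1]
    return s / (k - s + 1)            # more matches -> larger weight

rng = np.random.default_rng(0)
X_src, y_src = rng.normal(0, 1, (200, 5)), rng.integers(0, 2, 200)
X_tgt, y_tgt = rng.normal(0.5, 1, (80, 5)), rng.integers(0, 2, 80)

nb = GaussianNB()
nb.fit(X_src, y_src, sample_weight=similarity_weights(X_src, X_tgt))
auc = roc_auc_score(y_tgt, nb.predict_proba(X_tgt)[:, 1])
print(f"AUC on target-company data: {auc:.3f}")
```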
7.
To address the software defect prediction problem, the least squares support vector machine (LS-SVM) algorithm is introduced, which accelerates the selection of hyperparameters. A fast scheme for model correction by adding new samples one at a time is given. Taking software complexity metrics as the predictors, a software defect prediction model based on FLS-SVM is established. A concrete example illustrates the model's execution process and its prediction capability, which outperforms neural networks on small samples, and the complexity metrics with a significant influence on software defects are identified from the regression equation.
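A minimal numpy sketch of LS-SVM regression, whose training reduces to a single linear solve (which is what makes hyperparameter search fast); the regularization parameter gamma and the RBF width are illustrative choices, not values from the paper:

```python
# Sketch: LS-SVM regression via its dual linear system
#   [0   1^T        ] [b    ]   [0]
#   [1   K + I/gamma] [alpha] = [y]
# Training data here are synthetic complexity metrics.
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    n = len(y)
    K = rbf(X, X, sigma)
    A = np.block([[np.zeros((1, 1)), np.ones((1, n))],
                  [np.ones((n, 1)), K + np.eye(n) / gamma]])
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]            # bias b, dual weights alpha

def lssvm_predict(X_new, X, alpha, b, sigma=1.0):
    return rbf(X_new, X, sigma) @ alpha + b

X = np.random.rand(30, 4)             # complexity metrics per module
y = X @ np.array([2.0, 0.5, 0.0, 1.5]) + 0.1 * np.random.randn(30)
b, alpha = lssvm_fit(X, y)
print(lssvm_predict(X[:3], X, alpha, b))
```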
8.
Modern service-oriented enterprise systems have increasingly complex and dynamic loosely-coupled architectures that often exhibit poor performance and resource efficiency and have high operating costs. This is due to the inability to predict at run-time the effect of workload changes on performance-relevant application-level dependencies and adapt the system configuration accordingly. Architecture-level performance models provide a powerful tool for performance prediction; however, current approaches to modeling the context of software components are not suitable for use at run-time. In this paper, we analyze typical online performance prediction scenarios and propose a performance meta-model for (i) expressing and resolving parameter and context dependencies, (ii) modeling service abstractions at different levels of granularity and (iii) modeling the deployment of software components in complex resource landscapes. The presented meta-model is a subset of the Descartes Meta-Model (DMM) for online performance prediction, specifically designed for use in online scenarios. We motivate and validate our approach in the context of realistic and representative online performance prediction scenarios based on the SPECjEnterprise2010 standard benchmark.
9.
Mangla Monika, Sharma Nonita, Mohanty Sachi Nandan. Innovations in Systems and Software Engineering, 2022, 18(2):301-308
Unlike several other engineering disciplines, software engineering lacks well-defined research strategies. However, with the exponential rise in...
10.
This article tackles the problem of predicting the effort (in person-hours) required to fix a software defect posted on an issue tracking system. The proposed method is inspired by the Nearest Neighbour Approach presented in the pioneering work of Weiss et al. (2007) [1]. We propose four enhancements to Weiss et al. (2007) [1]: Data Enrichment, Majority Voting, Adaptive Threshold and Binary Clustering. Data Enrichment infuses additional issue information into the similarity-scoring procedure, aiming to increase the accuracy of the similarity scores. Majority Voting exploits the fact that many similar historical issues have repeating effort values that are close to the actual value. Adaptive Threshold automatically adjusts the similarity threshold to ensure that we obtain only the most similar matches. We use Binary Clustering when the similarity scores are very low, since such scores might result in misleading predictions; it uses common properties of issues to form clusters (independently of the similarity scores), which are then used to produce the predictions. Numerical results are presented showing a noticeable improvement over the method proposed in Weiss et al. (2007) [1].
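A minimal sketch of the nearest-neighbour idea with Majority Voting, assuming a TF-IDF similarity over issue titles; the similarity measure, issue fields, and effort values are illustrative, not the paper's exact pipeline:

```python
# Sketch: score historical issues by text similarity, take the k best
# matches, and return the most common effort value among them (Majority
# Voting). The issue history below is invented.
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

history = [("crash on save", 8), ("crash on load", 8),
           ("typo in dialog", 1), ("slow search query", 16),
           ("crash when saving file", 8)]
texts, efforts = zip(*history)

vec = TfidfVectorizer()
H = vec.fit_transform(texts)

def predict_effort(new_issue, k=3):
    sims = cosine_similarity(vec.transform([new_issue]), H).ravel()
    top = sims.argsort()[::-1][:k]               # k most similar issues
    votes = Counter(efforts[i] for i in top)     # majority voting
    return votes.most_common(1)[0][0]

print(predict_effort("application crashes on save"), "person-hours")
```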
11.
Tiejian Wang, Zhiwu Zhang, Xiaoyuan Jing, Liqiang Zhang. Automated Software Engineering, 2016, 23(4):569-590
Software defect prediction aims to predict the defect proneness of new software modules from historical defect data so as to improve the quality of a software system. Software historical defect data has a complicated structure and a marked class imbalance; how to fully analyze and utilize the existing historical defect data and build more precise and effective classifiers has attracted considerable interest from researchers in both academia and industry. Multiple kernel learning and ensemble learning are effective techniques in the field of machine learning. Multiple kernel learning can map the historical defect data to a higher-dimensional feature space where they are better represented, and ensemble learning can use a series of weak classifiers to reduce the bias generated by the majority class and obtain better predictive performance. In this paper, we propose to use multiple kernel learning to predict software defects. Using the characteristics of the metrics mined from open source software, we obtain a multiple kernel classifier through an ensemble learning method, which has the advantages of both multiple kernel learning and ensemble learning. We thus propose a multiple kernel ensemble learning (MKEL) approach for software defect classification and prediction. Considering the cost of risk in software defect prediction, we design a new sample weight vector updating strategy to reduce the cost of risk caused by misclassifying defective modules as non-defective ones. We employ the widely used NASA MDP datasets as test data to evaluate the performance of all compared methods; experimental results show that MKEL outperforms several representative state-of-the-art defect prediction methods.
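A minimal sketch of the two ingredients MKEL combines, under illustrative assumptions (kernel weights, cost ratio, synthetic data): a classifier on a weighted sum of base kernels, plus a cost-sensitive boosting-style weight update that penalizes missed defective modules more heavily:

```python
# Sketch: (i) a multiple-kernel classifier via a weighted kernel sum and
# (ii) one round of a cost-sensitive weight update. Kernel weights mu,
# the cost ratio, and the data are illustrative assumptions.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel, linear_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 6))                  # module metrics
y = (X[:, 0] + 0.5 * X[:, 1] > 1).astype(int)  # 1 = defective

def combined_kernel(A, B, mu=(0.6, 0.4)):
    return mu[0] * rbf_kernel(A, B) + mu[1] * linear_kernel(A, B)

w = np.full(len(y), 1 / len(y))                # sample weights
clf = SVC(kernel="precomputed")
clf.fit(combined_kernel(X, X), y, sample_weight=w)
pred = clf.predict(combined_kernel(X, X))

# Cost-sensitive update: misclassified defective modules (y = 1) get a
# larger weight boost than misclassified clean ones.
cost = np.where(y == 1, 3.0, 1.0)
w *= np.exp(cost * (pred != y))
w /= w.sum()
print("updated weight mass on defective modules:", w[y == 1].sum())
```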
12.
Prediction of fault-prone modules provides one way to support software quality engineering through improved scheduling and project control. The primary goal of our research was to develop and refine techniques for early prediction of fault-prone modules. The objective of this paper is to review and improve an approach previously examined in the literature for building prediction models, namely principal component analysis (PCA) and discriminant analysis (DA). We present findings of an empirical study at Ericsson Telecom AB for which the previous approach was found inadequate for predicting the most fault-prone modules using software design metrics. Instead of dividing modules into fault-prone and not fault-prone, modules are categorized into several groups according to the ordered number of faults. It is shown that the first discriminant coordinates (DC) statistically increase with the ordering of modules, thus improving prediction and prioritization efforts. The authors also experienced problems with the smoothing parameter as used previously for DA. To correct this problem and further improve predictability, separate estimation of the smoothing parameter is shown to be required.
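A minimal sketch of the PCA-then-DA pipeline on synthetic data, with modules grouped by ordered fault counts rather than a binary split; the group boundaries are illustrative:

```python
# Sketch: reduce correlated design metrics with PCA, then use linear
# discriminant analysis on ordered fault-count groups and inspect the
# first discriminant coordinate per group. Data and group cuts invented.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
metrics = rng.normal(size=(300, 12))                 # design metrics
faults = rng.poisson(np.exp(metrics[:, 0]))          # synthetic fault counts

# Ordered fault groups: 0 faults, 1-2 faults, 3+ faults.
groups = np.digitize(faults, bins=[1, 3])

Z = PCA(n_components=5).fit_transform(metrics)
lda = LinearDiscriminantAnalysis().fit(Z, groups)
dc1 = lda.transform(Z)[:, 0]                         # first discriminant coordinate
for g in range(3):
    print(f"group {g}: mean DC1 = {dc1[groups == g].mean():.2f}")
```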
13.
Lifecycle-based software defect prediction techniques
To ensure software reliability and quality, a software defect prediction method using a PCA-BP fuzzy neural network is proposed on the basis of the software development lifecycle. Considering the various factors that influence software reliability, reliability metrics were selected in accordance with relevant standards and engineering practice. Metric data were collected from a class of flight control software in a real project, defects were predicted with the proposed model, and the predictions were compared with those of a traditional BP neural network model. The comparison shows that the PCA-BP neural network method, which incorporates principal component analysis, converges faster and predicts more accurately than the BP neural network method.
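A minimal sketch of the PCA-BP idea (PCA to decorrelate the metrics, then a backpropagation network), on synthetic data since the flight control data are not public; the component count and layer sizes are illustrative:

```python
# Sketch: PCA compresses correlated reliability metrics before an MLP
# (a backpropagation network) predicts defect counts. All sizes invented.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(120, 15))              # reliability metrics
y = np.abs(X[:, :3].sum(axis=1)) + rng.normal(0, 0.1, 120)

pca_bp = make_pipeline(StandardScaler(),
                       PCA(n_components=6),   # decorrelate metrics first
                       MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000))
pca_bp.fit(X, y)
print("training R^2:", round(pca_bp.score(X, y), 3))
```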
14.
In the field of software architecture, a paradigm shift is occurring from describing the outcome of the architecting process to describing the Architectural Knowledge (AK) created and used during architecting. Many AK models have been defined to represent domain concepts and their relationships, and they can be used for sharing and reusing AK across organizations, especially in geographically distributed contexts. However, different AK domain models can represent different concepts, which makes effective AK sharing challenging. When more than one AK model coexists, predicting AK sharing quality from the concept differences across AK models is necessary to understand the quality of mapping from one AK model to another. Previous work in this area lacks validation in the actual practice of AK sharing. In this paper, we carry out validation using four AK sharing case studies, and we improve the previous prediction models. We developed a new, more advanced mapping quality prediction model that (i) improves the prediction accuracy of the recall rate of AK sharing quality and (ii) provides a better balance between prediction effort and accuracy for AK sharing quality.
15.
We study the possibility of constructing decision trees with evolutionary algorithms in order to increase their predictive accuracy. We present a self-adapting evolutionary algorithm for the induction of decision trees and describe the principle of decision making based on multiple evolutionarily induced decision trees, a decision forest. The developed model is used as a fault prediction approach to foresee dangerous software modules, whose identification can greatly enhance the reliability of software.
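A minimal sketch of decision-forest voting; the paper induces each tree with a self-adapting evolutionary algorithm, for which greedily induced trees on bootstrap samples stand in here so the voting principle stays visible in a few lines:

```python
# Sketch: majority vote over multiple independently induced decision
# trees (a decision forest). Greedy trees stand in for the paper's
# evolutionarily induced trees; data are synthetic.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 8))
y = (X[:, 0] * X[:, 1] > 0).astype(int)      # 1 = dangerous module

forest = []
for _ in range(15):
    idx = rng.integers(0, len(y), len(y))    # bootstrap sample
    forest.append(DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx]))

votes = np.mean([t.predict(X) for t in forest], axis=0)
prediction = (votes >= 0.5).astype(int)      # majority vote
print("forest training accuracy:", (prediction == y).mean())
```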
16.
A new software for prediction of femoral neck fractures
Testi D., Cappello A., Sgallari F., Rumpf M., Viceconti M. Computer Methods and Programs in Biomedicine, 2004, 75(2):141-145
Femoral neck fractures are an important clinical, social and economic problem. Although many attempts have been made to improve the accuracy of fracture risk prediction, retrospective studies have demonstrated that the standard clinical protocol achieves an accuracy of about 65%. A new procedure was developed that includes in the prediction not only bone mineral density but also geometric and femoral strength information, achieving an accuracy of about 80% in a previous retrospective study. The aim of the present work was to re-engineer these research-based procedures and develop real-time software for the prediction of femoral fracture risk. The result is an efficient, repeatable and easy-to-use software tool for the evaluation of femoral neck fracture risk that can be inserted into daily clinical practice, providing a useful aid for improving fracture prediction.
17.
Heiko Koziolek, Bastian Schlich, Steffen Becker, Michael Hauck. Empirical Software Engineering, 2013, 18(4):746-790
During software system evolution, software architects intuitively trade off the different architecture alternatives for their extra-functional properties, such as performance, maintainability, reliability, security, and usability. Researchers have proposed numerous model-driven prediction methods based on queuing networks or Petri nets, which claim to be more cost-effective and less error-prone than current practice. Practitioners are reluctant to apply these methods because of the unknown prediction accuracy and work effort. We have applied a novel model-driven prediction method called Q-ImPrESS to a large-scale process control system from ABB consisting of several million lines of code. This paper reports on the achieved performance prediction accuracy and reliability prediction sensitivity analyses, as well as the effort in person-hours required to achieve these results.
18.
Comparing software prediction techniques using simulation
Shepperd M., Kadoda G. IEEE Transactions on Software Engineering, 2001, 27(11):1014-1022
The need for accurate software prediction systems increases as software becomes larger and more complex. We believe that the underlying characteristics of a data set (size, number of features, type of distribution, etc.) influence the choice of the prediction system to be used. For this reason, we would like to control the characteristics of such data sets in order to systematically explore the relationship between accuracy, choice of prediction system, and data set characteristics. It would also be useful to have a large validation data set. Our solution is to simulate data, allowing both control and the possibility of large (1,000-case) validation sets. We compare four prediction techniques: regression, rule induction, nearest neighbor (a form of case-based reasoning), and neural nets. The results suggest that there are significant differences depending upon the characteristics of the data set. Consequently, researchers should consider prediction context when evaluating competing prediction systems. We observed that the messier the data and the more complex its relationship with the dependent variable, the greater the variability in the results. In the more complex cases, we observed significantly different results depending upon the particular training set sampled from the underlying data set. However, our most important result is that it is more fruitful to ask which prediction system is best in a particular context rather than which prediction system is best overall.
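A minimal sketch of the simulation idea: generate data sets with controlled characteristics (here, noise level), train competing prediction systems on small samples, and score them on large (1,000-case) simulated validation sets; the generator and models are illustrative:

```python
# Sketch: simulate datasets with a controllable "messiness" knob and
# compare prediction systems on a large simulated validation set.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

def simulate(n, noise, rng):
    X = rng.uniform(size=(n, 3))
    y = 5 * X[:, 0] + np.sin(6 * X[:, 1]) + rng.normal(0, noise, n)
    return X, y

rng = np.random.default_rng(5)
for noise in (0.1, 1.0):                      # messiness of the data
    Xtr, ytr = simulate(100, noise, rng)
    Xval, yval = simulate(1000, noise, rng)   # large validation set
    for model in (LinearRegression(), KNeighborsRegressor(),
                  DecisionTreeRegressor(max_depth=4)):
        r2 = model.fit(Xtr, ytr).score(Xval, yval)
        print(f"noise={noise}: {type(model).__name__} R^2={r2:.2f}")
```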
19.
Giuseppe Romanazzi, Peter K. Jimack, Christopher E. Goodyer. Advances in Engineering Software, 2011, 42(5):247-258
We propose a model for describing and predicting the parallel performance of a broad class of parallel numerical software on distributed memory architectures. The purpose of this model is to allow reliable predictions to be made of the performance of the software on large numbers of processors of a given parallel system by benchmarking the code on only small numbers of processors. After describing the methods used and emphasizing the simplicity of their implementation, we test the approach on a range of engineering software applications built upon multigrid algorithms. Despite their simplicity, the models are demonstrated to provide accurate and robust predictions across a range of different parallel architectures, partitioning strategies and multigrid codes. In particular, the effectiveness of the predictive methodology is shown for a practical engineering software implementation of an elastohydrodynamic lubrication solver.
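A minimal sketch of the benchmark-and-extrapolate idea, assuming an illustrative scaling form T(p) = a + b*W/p + c*log2(p) fitted to timings from small processor counts (the paper's model form may differ):

```python
# Sketch: fit a simple scaling model to small-p benchmark timings, then
# predict runtimes at large processor counts. Work size, timings, and
# the model form are illustrative assumptions.
import numpy as np

W = 1e9                                         # fixed problem size (work units)
p_small = np.array([2, 4, 8, 16])               # benchmarked processor counts
t_small = np.array([260.0, 135.0, 72.0, 41.0])  # measured runtimes (s)

# Least-squares fit of the three model coefficients a, b, c.
A = np.column_stack([np.ones_like(p_small, dtype=float),
                     W / p_small, np.log2(p_small)])
coef, *_ = np.linalg.lstsq(A, t_small, rcond=None)

for p in (128, 512, 1024):                      # extrapolate to large p
    t = coef @ [1.0, W / p, np.log2(p)]
    print(f"predicted runtime on {p} processors: {t:.1f} s")
```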
20.
Software change prediction is crucial for efficiently planning resource allocation during the testing and maintenance phases of software. Moreover, correct identification of change-prone classes in the early phases of the software development life cycle helps in developing cost-effective, good-quality and maintainable software. An effective software change prediction model should recognize change-prone and not change-prone classes with equally high accuracy. However, this is often not the case, as software practitioners must deal with imbalanced data sets in which instances of one class type far outnumber those of the other. In such a scenario, the minority classes are not predicted with much accuracy, leading to strategic losses. This study evaluates a number of techniques for handling imbalanced data sets, using various data sampling methods and MetaCost learners on six open-source data sets. The results of the study advocate the use of the resample-with-replacement sampling method for effective imbalanced learning.
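A minimal sketch of resample-with-replacement, the method the study recommends: upsample the minority (change-prone) class to parity before training; the data and the choice of classifier are illustrative:

```python
# Sketch: balance an imbalanced change-proneness dataset by upsampling
# the minority class with replacement, then train a classifier.
import numpy as np
from sklearn.utils import resample
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
X = rng.normal(size=(500, 10))
y = (rng.uniform(size=500) < 0.1).astype(int)    # ~10% change-prone

X_min, X_maj = X[y == 1], X[y == 0]
X_up, y_up = resample(X_min, np.ones(len(X_min)),
                      replace=True, n_samples=len(X_maj), random_state=0)
X_bal = np.vstack([X_maj, X_up])                 # balanced training set
y_bal = np.concatenate([np.zeros(len(X_maj)), y_up])

clf = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
print("predicted change-prone fraction:", clf.predict(X).mean())
```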