Similar Documents
20 similar documents found.
1.
We have in previous studies reported our findings and concerns about the reliability and validity of the evaluation procedures used in comparative studies on competing effort prediction models. In particular, we have raised concerns about the use of accuracy statistics to rank and select models. Our concern is strengthened by the observed lack of consistent findings. This study offers more insight into the causes of conclusion instability by elaborating on the findings of our previous work concerning the reliability and validity of the evaluation procedures. We show that model selection based on the accuracy statistics MMRE, MMER, MBRE, and MIBRE contributes to conclusion instability as well as to the selection of inferior models. We argue and show that the evaluation procedure must include an assessment of whether the functional form of the prediction model makes sense, to better prevent the selection of inferior models.
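For readers unfamiliar with these accuracy statistics, the sketch below computes them from actual and predicted effort values, following the definitions commonly used in the effort-estimation literature (MMRE and MMER normalize the absolute error by the actual and the estimated value, respectively; MBRE and MIBRE by the smaller and the larger of the two). The data and names are illustrative, not taken from the study.

```python
import numpy as np

def accuracy_stats(actual, predicted):
    """Mean relative-error statistics commonly used to rank effort prediction models.

    MMRE  = mean(|act - est| / act)
    MMER  = mean(|act - est| / est)
    MBRE  = mean(|act - est| / min(act, est))
    MIBRE = mean(|act - est| / max(act, est))
    """
    act = np.asarray(actual, dtype=float)
    est = np.asarray(predicted, dtype=float)
    abs_err = np.abs(act - est)
    return {
        "MMRE": np.mean(abs_err / act),
        "MMER": np.mean(abs_err / est),
        "MBRE": np.mean(abs_err / np.minimum(act, est)),
        "MIBRE": np.mean(abs_err / np.maximum(act, est)),
    }

# Illustrative example: two competing models evaluated on the same hold-out projects.
actual = [120, 80, 300, 45]
model_a = [100, 95, 260, 50]
model_b = [150, 70, 340, 30]
print("model A:", accuracy_stats(actual, model_a))
print("model B:", accuracy_stats(actual, model_b))
```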

2.
Software managers are routinely confronted with software projects that contain errors or inconsistencies and exceed budget and time limits. By mining software repositories with comprehensible data mining techniques, predictive models can be induced that offer software managers the insights they need to tackle these quality and budgeting problems in an efficient way. This paper deals with the role that the Ant Colony Optimization (ACO)-based classification technique AntMiner+ can play as a comprehensible data mining technique to predict erroneous software modules. In an empirical comparison on three real-world public datasets, the rule-based models produced by AntMiner+ are shown to achieve a predictive accuracy that is competitive with that of the models induced by several other included classification techniques, such as C4.5, logistic regression, and support vector machines. In addition, we argue that the intuitiveness and comprehensibility of the AntMiner+ models can be considered superior to that of the latter models.
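AntMiner+ has no standard Python implementation, so the following sketch reproduces only the baseline side of such a comparison: a C4.5-style decision tree, logistic regression, and an SVM evaluated with cross-validation on a synthetic stand-in for a public defect dataset. The dataset, parameters, and names are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder for a real fault dataset (module metrics + defective/non-defective label).
X, y = make_classification(n_samples=500, n_features=20, weights=[0.8, 0.2],
                           random_state=0)

models = {
    "C4.5-style tree": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "logistic regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "SVM (RBF)": make_pipeline(StandardScaler(), SVC()),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    print(f"{name:20s} accuracy = {scores.mean():.3f} +/- {scores.std():.3f}")
```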

3.
A critique of software defect prediction models
Many organizations want to predict the number of defects (faults) in software systems, before they are deployed, to gauge the likely delivered quality and maintenance effort. To help with this, numerous software metrics and statistical models have been developed, with a correspondingly large literature. We provide a critical review of this literature and the state of the art. Most of the wide range of prediction models use size and complexity metrics to predict defects. Others are based on testing data, the “quality” of the development process, or take a multivariate approach. The authors of the models have often made heroic contributions to a subject otherwise bereft of empirical studies. However, there are a number of serious theoretical and practical problems in many studies. The models are weak because of their inability to cope with the, as yet, unknown relationship between defects and failures. There are fundamental statistical and data quality problems that undermine model validity. More significantly, many prediction models tend to model only part of the underlying problem and seriously misspecify it. To illustrate these points, the Goldilocks Conjecture, that there is an optimum module size, is used to show the considerable problems inherent in current defect prediction approaches. Careful and considered analysis of past and new results shows that the conjecture lacks support and that some models are misleading. We recommend holistic models for software defect prediction, using Bayesian belief networks, as alternative approaches to the single-issue models used at present. We also argue for research into a theory of “software decomposition” in order to test hypotheses about defect introduction and help construct a better science of software engineering.

4.
This paper presents a methodology for maintaining the operational validity of simulation models of observable systems in order to support operational decisions. In this methodology, real-time system data are continuously compared against simultaneous prediction intervals on selected responses constructed using the simulation model. The methodology is illustrated using a case study of a simulation model of a flexible manufacturing system. Different invalidating discrepancies between the model and the system are investigated. Results indicate that using nontraditional responses may lead to faster detection of invalidating changes, that the speed of detection is a function of the scope of the change, and that the model may evolve with the system and continue to be used to guard against random changes.
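A minimal sketch of the core check in such a methodology, assuming a percentile prediction interval built from independent simulation replications: incoming system observations that fall outside the interval flag a possible loss of operational validity. The response, the interval construction, and the example values are our illustrative choices, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(42)

def prediction_interval(sim_replications, alpha=0.05):
    """Percentile-based prediction interval from independent simulation runs."""
    lo, hi = np.quantile(sim_replications, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Hypothetical response: hourly throughput from 200 simulation replications.
sim_throughput = rng.normal(loc=50.0, scale=4.0, size=200)
lo, hi = prediction_interval(sim_throughput)

# Real-time observations streamed from the (possibly drifting) system.
observed = [51.2, 48.7, 55.9, 62.3, 61.8]
for t, obs in enumerate(observed):
    if not (lo <= obs <= hi):
        print(f"t={t}: {obs:.1f} outside [{lo:.1f}, {hi:.1f}] -> possible invalidating change")
```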

5.
This paper provides a systematic review of previous software fault prediction studies with a specific focus on metrics, methods, and datasets. The review covers 74 software fault prediction papers in 11 journals and several conference proceedings. According to the review results, the usage percentage of public datasets increased significantly and the usage percentage of machine learning algorithms increased slightly since 2005. In addition, method-level metrics are still the most dominant metrics in the fault prediction research area, and machine learning algorithms are still the most popular methods for fault prediction. Researchers working in the software fault prediction area should continue to use public datasets and machine learning algorithms to build better fault predictors. The usage percentage of class-level metrics, however, remains below acceptable levels; they should be used much more than they are now in order to predict faults earlier, in the design phase of the software life cycle.

6.
A core assumption of any prediction model is that the test data distribution does not differ from the training data distribution. Prediction models used in software engineering are no exception. In reality, this assumption can be violated in many ways, resulting in inconsistent and non-transferable observations across different cases. The goal of this paper is to explain the phenomenon of conclusion instability through the concept of dataset shift, from the perspective of software effort and fault prediction. Different types of dataset shift are explained with examples from software engineering, and techniques for addressing the associated problems are discussed. While dataset shifts in the form of sample selection bias and imbalanced data are well known in software engineering research, understanding the other types is relevant for interpreting non-transferable results across different sites and studies. The software engineering community should be aware of, and account for, dataset-shift-related issues when evaluating the validity of research outcomes.
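As a concrete illustration of one type of dataset shift (covariate shift), the sketch below compares the marginal distribution of a single feature in the training and test sets with a two-sample Kolmogorov-Smirnov test. The data, the feature, and the significance threshold are invented for illustration.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical feature (e.g., project size in KLOC) in training vs. test projects;
# the test projects are deliberately drawn from a shifted population.
train_size = rng.lognormal(mean=3.0, sigma=0.5, size=300)
test_size = rng.lognormal(mean=3.6, sigma=0.5, size=100)

res = ks_2samp(train_size, test_size)
if res.pvalue < 0.05:
    print(f"covariate shift suspected (KS={res.statistic:.2f}, p={res.pvalue:.4f}): "
          "test projects come from a different size distribution")
```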

7.
With the continuous development of computer technology, the range of comparable products of the same kind keeps widening, and choosing the most suitable one among many similar products is a recurring problem. It can be addressed by selection testing or comparative testing, but there is currently no general method for analyzing the results of such tests. To address this problem, and drawing on the findings of related work, this paper proposes a weight-based comparative analysis model for software testing, through which the test results of the individual products can be reflected intuitively, scientifically, and objectively.
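A weight-based comparison of this kind can be sketched as a weighted sum of normalized per-criterion test scores for each product; the criteria, weights, and scores below are invented for illustration and are not taken from the paper.

```python
# Criterion weights (summing to 1) and per-product test scores, all invented.
weights = {"functionality": 0.4, "performance": 0.3, "usability": 0.2, "cost": 0.1}

# Raw scores already normalized to [0, 1] (higher is better).
products = {
    "Product A": {"functionality": 0.9, "performance": 0.7, "usability": 0.8, "cost": 0.5},
    "Product B": {"functionality": 0.8, "performance": 0.9, "usability": 0.6, "cost": 0.7},
}

def weighted_score(scores, weights):
    """Overall comparison score as the weight-adjusted sum of criterion scores."""
    return sum(weights[c] * scores[c] for c in weights)

ranking = sorted(products, key=lambda p: weighted_score(products[p], weights), reverse=True)
for p in ranking:
    print(f"{p}: {weighted_score(products[p], weights):.2f}")
```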

8.
Software Quality Journal - Vulnerability severity prediction (VSP) models provide useful insight for vulnerability prioritization and software maintenance. Previous studies have proposed a variety...

9.
Constructing an accurate effort prediction model is a challenge in software engineering. This paper presents three Bayesian statistical software effort prediction models for database-oriented software systems, which are developed using a specific 4GL toolsuite. The models consist of specification-based software size metrics and a development-team productivity metric. The models are constructed based on the subjective knowledge of a human expert and calibrated using empirical data collected from 17 software systems developed in the target environment. The models' predictive accuracy is evaluated using subsets of the same data, which were not used for the models' calibration. The results show that the models have achieved very good predictive accuracy in terms of MMRE and pred measures. Hence, it is confirmed that the Bayesian statistical models can predict effort successfully in the target environment. In comparison with commonly used multiple linear regression models, the Bayesian statistical models' predictive accuracy is equivalent in general. However, when the number of software systems used for the models' calibration becomes smaller than five, the predictive accuracy of the best Bayesian statistical models is significantly better than that of the multiple linear regression model. This result suggests that the Bayesian statistical models would be a better choice when software organizations and practitioners do not possess sufficient empirical data for model calibration. The authors expect these findings to encourage more researchers to investigate the use of Bayesian statistical models for predicting software effort.
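The paper's models combine an expert's subjective prior with a small calibration set; a generic analogue is Bayesian linear regression with a Gaussian prior, sketched below with invented metrics and data. This is only a schematic stand-in, not the authors' 4GL-specific models.

```python
import numpy as np

def bayesian_linear_posterior(X, y, prior_mean, prior_cov, noise_var):
    """Posterior over regression weights with a Gaussian prior and known noise variance."""
    prior_prec = np.linalg.inv(prior_cov)
    post_cov = np.linalg.inv(prior_prec + X.T @ X / noise_var)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
    return post_mean, post_cov

# Hypothetical calibration data: [1, size metric, productivity] -> effort (person-hours).
X = np.array([[1, 12.0, 0.8], [1, 30.0, 1.1], [1, 8.0, 0.7], [1, 22.0, 0.9]])
y = np.array([400.0, 950.0, 260.0, 700.0])

prior_mean = np.array([50.0, 30.0, -100.0])         # expert's subjective estimates (invented)
prior_cov = np.diag([100.0**2, 20.0**2, 200.0**2])  # wide prior -> weakly held belief
post_mean, post_cov = bayesian_linear_posterior(X, y, prior_mean, prior_cov,
                                                noise_var=50.0**2)

new_project = np.array([1, 18.0, 0.85])
print("predicted effort:", new_project @ post_mean)
```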

10.
Two important problems that can affect the performance of classification models are high dimensionality (an overabundance of independent features in the dataset) and imbalanced data (a skewed class distribution that leaves at least one class with many fewer instances than the others). To resolve these problems concurrently, we propose an iterative feature selection approach, which repeatedly applies data sampling (to address class imbalance) followed by feature selection (to address high dimensionality), and finally performs an aggregation step that combines the ranked feature lists from the separate iterations of sampling. This approach is designed to find a ranked feature list that is particularly effective on the more balanced dataset resulting from sampling, while minimizing the risk of losing data through the sampling step and missing important features. To demonstrate this technique, we employ 18 different feature selection algorithms and Random Undersampling with two post-sampling class distributions. We also investigate the use of sampling and feature selection without the iterative step (i.e., using the ranked list from a single iteration rather than combining the lists from multiple iterations), and compare these results with those from the version that uses iteration. Our study is carried out using three groups of datasets with different levels of class balance, all of which were collected from a real-world software system. All of our experiments use four different learners and one feature subset size. We find that our proposed iterative feature selection approach outperforms the non-iterative approach.
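A hedged sketch of the iterative idea: repeatedly undersample the majority class, rank the features on each balanced sample (here with a single univariate ranker), and aggregate the per-iteration ranks into one list. The ranker, the number of iterations, and the synthetic data are placeholders for the paper's 18 rankers and its software measurement data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
# Synthetic imbalanced, high-dimensional stand-in for the software data.
X, y = make_classification(n_samples=1000, n_features=30, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)

def undersample(X, y, rng):
    """Random undersampling of the majority class to a 50:50 class distribution."""
    min_idx = np.flatnonzero(y == 1)
    maj_idx = rng.choice(np.flatnonzero(y == 0), size=len(min_idx), replace=False)
    idx = np.concatenate([min_idx, maj_idx])
    return X[idx], y[idx]

n_iterations = 10
rank_sum = np.zeros(X.shape[1])
for _ in range(n_iterations):
    Xs, ys = undersample(X, y, rng)
    scores = mutual_info_classif(Xs, ys, random_state=0)
    # argsort of argsort turns scores into ranks (higher score -> higher rank).
    rank_sum += scores.argsort().argsort()

aggregated_order = np.argsort(-rank_sum)      # best features first
print("top 5 features:", aggregated_order[:5])
```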

11.
Users of numerical software are often troubled by accuracy problems attributable to the rounding errors generated by computer hardware. M. La Porte and J. Vignes (4) created the permutation-perturbation method for evaluating the validity of the solutions of linear algebraic systems, detecting matrix singularity, and providing an optimal termination criterion for iterative methods. These problems exist, and are considerably amplified, in linear and non-linear programming algorithms that use simplex-like methods: the Reduced Gradient method of P. Wolfe and the Generalized Reduced Gradient method of J. Abadie (1). These methods perform a long sequence of matrix inversions, which amplifies rounding errors, and it is not unusual to obtain false basic solutions or singular basic matrices. Moreover, the classical termination criteria of unconstrained optimization may cause either a premature stop of the algorithm, producing a solution far from the optimum, or, on the contrary, a large number of unprofitable iterations that do not improve the current solution. In this paper, I suggest some quick and efficient procedures for solving these problems.
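The perturbation idea can be illustrated, in a much-simplified form, by re-solving the same linear system under tiny random perturbations of its data and estimating how many significant digits of the solution remain stable. The tolerances and the near-singular example are illustrative only, not the permutation-perturbation method itself.

```python
import numpy as np

rng = np.random.default_rng(1)

def stable_digits(A, b, n_trials=10, eps=1e-12):
    """Estimate how many significant digits of x in A x = b survive tiny data perturbations."""
    base = np.linalg.solve(A, b)
    solutions = []
    for _ in range(n_trials):
        dA = A * (1 + eps * rng.standard_normal(A.shape))
        db = b * (1 + eps * rng.standard_normal(b.shape))
        solutions.append(np.linalg.solve(dA, db))
    spread = np.max(np.abs(np.array(solutions) - base), axis=0)
    rel = spread / np.maximum(np.abs(base), 1e-300)
    return -np.log10(np.maximum(rel, 1e-17))

# A nearly singular system: the estimated number of stable digits drops sharply.
A = np.array([[1.0, 1.0], [1.0, 1.0 + 1e-10]])
b = np.array([2.0, 2.0 + 1e-10])
print("stable significant digits per component:", stable_digits(A, b))
```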

12.
13.
Software development effort estimation (SDEE) is one of the main tasks in software project management. It is crucial for a project manager to predict the effort or cost of a software project accurately in a bidding process, since overestimation will lead to bidding loss and underestimation will cause the company to lose money. Several SDEE models exist; machine learning models, especially neural network models, are among the most prominent in the field. In this study, four different neural network models—multilayer perceptron, general regression neural network, radial basis function neural network, and cascade correlation neural network—are compared with each other based on: (1) predictive accuracy centred on the mean absolute error criterion, (2) whether the model tends to overestimate or underestimate, and (3) how each model ranks the importance of its inputs. Industrial datasets from the International Software Benchmarking Standards Group (ISBSG) are used to train and validate the four models. The main ISBSG dataset was filtered and then divided into five datasets based on the productivity value of each project. Results show that the four models tend to overestimate in 80% of the datasets, and that the significance of the model inputs varies with the selected model. Furthermore, the cascade correlation neural network outperforms the other three models in the majority of the datasets based on the mean absolute residual criterion.
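Of the four architectures, only the multilayer perceptron has a direct scikit-learn counterpart; the sketch below trains one on synthetic effort data and reports the mean absolute error together with the over/under-estimation tendency. The data generation and hyperparameters are illustrative and bear no relation to the ISBSG setup.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)

# Synthetic stand-in: functional size and team size -> effort (person-hours).
size = rng.uniform(50, 2000, 400)
team = rng.integers(2, 20, 400)
effort = 0.3 * size / (1 + 0.05 * team) + rng.normal(0, 20, 400)
X = np.column_stack([size, team])

X_tr, X_te, y_tr, y_te = train_test_split(X, effort, test_size=0.25, random_state=0)
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=5000,
                                   random_state=0))
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
residuals = pred - y_te
print("MAE:", mean_absolute_error(y_te, pred))
print("share of overestimated projects:", np.mean(residuals > 0))
```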

14.
15.
16.
Understanding how a program is constructed and how it functions are significant components of the task of maintaining or enhancing a computer program. We have analyzed videotaped protocols of experienced programmers as they enhanced a personnel database program. Our analysis suggests that there are two strategies for program understanding: the systematic strategy and the as-needed strategy. The programmer using the systematic strategy traces data flow through the program in order to understand global program behavior. The programmer using the as-needed strategy focuses on local program behavior in order to localize study of the program. Our empirical data show that there is a strong relationship between using a systematic approach to acquire knowledge about the program and modifying the program successfully. Programmers who used the systematic approach to study the program constructed successful modifications; programmers who used the as-needed approach failed to construct successful modifications. Programmers who used the systematic strategy gathered knowledge about the causal interactions of the program's functional components. Programmers who used the as-needed strategy did not gather such causal knowledge and therefore failed to detect interactions among components of the program.

17.
Model-based software security prediction and analysis
To represent and analyze security defects and hidden vulnerabilities in software effectively, model-based software security analysis uses multi-level modeling to describe security characteristics, and introduces a security prediction technique for evaluating the security of software and of the interactions between software components. Security distances are obtained by analyzing the degree of coupling between software constituents; on this basis a security dependency graph is generated, and security prediction and analysis are then performed over that graph. Model-based security analysis can predict and analyze potential security risks, providing a basis and a means for software testing and maintenance.
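One way to operationalize a security dependency graph is sketched below: components are nodes, coupling strengths play the role of security distances, and a simple rule propagates risk along dependencies. The components, scores, and propagation rule are invented for illustration and are not the paper's technique.

```python
import networkx as nx

# Hypothetical component dependency graph: an edge A -> B means A depends on B,
# so weaknesses in B can propagate to A. Edge weights act as coupling strengths.
G = nx.DiGraph()
G.add_weighted_edges_from([
    ("ui", "auth", 0.9), ("ui", "api", 0.7),
    ("api", "db", 0.8), ("auth", "db", 0.6),
])

base_risk = {"ui": 0.1, "api": 0.2, "auth": 0.3, "db": 0.5}  # invented per-component scores

def propagated_risk(G, base_risk):
    """Risk of a component = max of its own risk and coupling-weighted risk of its dependencies."""
    risk = dict(base_risk)
    for node in reversed(list(nx.topological_sort(G))):
        for _, dep, data in G.out_edges(node, data=True):
            risk[node] = max(risk[node], data["weight"] * risk[dep])
    return risk

print(propagated_risk(G, base_risk))
```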

18.
Communication models for software agents
In distributed multi-agent systems, communication between agents is the basis of cooperation. Because agent technology draws on achievements from artificial intelligence, distributed computing, software engineering, and other fields, and is applied in a wide range of domains, its communication mechanisms are diverse and very flexible. This paper analyzes the three communication models most commonly used in agent systems, together with their advantages, disadvantages, and applicable domains; introduces several examples that combine multiple communication styles; discusses security issues in software agent communication; and looks ahead to future development trends.

19.
A review is carried out of how queueing network models with blocking have been applied so far to the performance evaluation and prediction of Software Architectures (SA). Queueing network models with finite-capacity queues and blocking have recently been introduced and applied as more realistic models of systems with finite-capacity resources and population constraints. Queueing network models have often been adopted for the evaluation of software performance. Starting from our own experience, we observe the need for a more accurate definition of the performance models of SA to capture some features of the communication systems. We consider queueing networks with finite capacity and blocking after service (BAS) to represent synchronization constraints that cannot easily be modeled with queueing network models with infinite-capacity queues. We investigate the use of queueing networks with blocking as performance models of SA with concurrent components and synchronous communication. Queueing-theoretic analysis is used to solve the queueing network model and study the synchronous communication and performance of concurrent software components. Our experience is supported by other approaches that also propose the use of queueing networks with blocking. Directions for future research work in the field are included.
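For intuition about finite-capacity stations, the blocking probability of a single M/M/1/K queue has a closed form, computed in the sketch below. This is only the simplest building block of such models, not a solver for queueing networks with blocking after service.

```python
def mm1k_blocking_probability(lam, mu, K):
    """Probability that an arriving job finds an M/M/1/K station full (and is blocked/lost).

    Uses the standard closed form with rho = lam / mu:
      P_K = (1 - rho) * rho**K / (1 - rho**(K + 1))   for rho != 1
      P_K = 1 / (K + 1)                               for rho == 1
    """
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (K + 1)
    return (1 - rho) * rho**K / (1 - rho**(K + 1))

# Illustrative example: arrivals at 0.8/s, service at 1.0/s, total capacity K jobs.
for K in (2, 5, 10):
    print(K, round(mm1k_blocking_probability(0.8, 1.0, K), 4))
```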

20.

Context

Assessing software quality at the early stages of the design and development process is very difficult since most of the software quality characteristics are not directly measurable. Nonetheless, they can be derived from other measurable attributes. For this purpose, software quality prediction models have been extensively used. However, building accurate prediction models is hard due to the lack of data in the domain of software engineering. As a result, the prediction models built on one data set show a significant deterioration of their accuracy when they are used to classify new, unseen data.

Objective

The objective of this paper is to present an approach that optimizes the accuracy of software quality predictive models when used to classify new data.

Method

This paper presents an adaptive approach that takes already built predictive models and adapts them (one at a time) to new data. We use an ant colony optimization algorithm in the adaptation process. The approach is validated on stability of classes in object-oriented software systems and can easily be used for any other software quality characteristic. It can also be easily extended to work with software quality predictive problems involving more than two classification labels.

Results

Results show that our approach outperforms the machine learning algorithm C4.5 as well as random guessing. It also preserves the expressiveness of the models, which provide not only the classification label but also guidelines on how to attain it.

Conclusion

Our approach is an adaptive one that can be seen as taking predictive models that have already been built from common domain data and adapting them to context-specific data. This is suitable for the domain of software quality, since the data is very scarce and hence predictive models built from one data set are hard to generalize and reuse on new data.
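The paper's adaptation algorithm is not reproduced here; as a rough illustration of the ant-colony idea described in the Method section, the sketch below uses pheromone-guided search to re-tune a single decision threshold of an already-built scoring model on new, context-specific data. The candidate set, evaporation rate, fitness function, and data are all our own simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)

# New, context-specific data: a score produced by an already-built model, plus true labels.
labels = rng.integers(0, 2, 200)
scores = labels * 0.8 + rng.normal(0.0, 0.5, 200)

candidates = np.linspace(-0.5, 1.5, 21)      # candidate thresholds the ants can pick
pheromone = np.ones_like(candidates)
evaporation, n_ants, n_iter = 0.2, 10, 30

def fitness(thr):
    """Accuracy of the adapted rule 'predict 1 if score > thr' on the new data."""
    return np.mean((scores > thr).astype(int) == labels)

for _ in range(n_iter):
    probs = pheromone / pheromone.sum()
    picks = rng.choice(len(candidates), size=n_ants, p=probs)   # ants choose thresholds
    pheromone *= (1 - evaporation)                               # pheromone evaporation
    for i in picks:
        pheromone[i] += fitness(candidates[i])                   # reward good thresholds

best = candidates[np.argmax(pheromone)]
print("adapted threshold:", best, "accuracy on new data:", fitness(best))
```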
