Similar Documents
20 similar documents found.
1.
Estimation of reliability and of the number of faults present in software in its early development phase, i.e., the requirement analysis or design phase, is very beneficial for developing reliable software at optimal cost. Software reliability prediction early in development is highly desirable to stakeholders, software developers, managers, and end users. Since failure data are unavailable in the early phases of software development, reliability-relevant software metrics and data from similar projects are used instead to develop models for early software fault prediction. The proposed model uses the linguistic values of software metrics in a fuzzy inference system to predict the total number of faults present in software at its requirement analysis phase. Considering a specific target reliability, the weight of each input software metric, and the size of the software, an algorithm is proposed for developing a general fuzzy rule base. The model is validated on data from 20 real software projects, with the linguistic values of four software metrics related to the requirement analysis phase as inputs, and its performance is compared with two existing early software fault prediction models.
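A minimal Mamdani-style sketch of the idea, assuming four requirement-phase metrics already normalized to [0, 1]; the membership functions, the two-rule base, and the fault-density scale are illustrative assumptions, not the paper's actual rule base.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with peak at b."""
    return np.maximum(np.minimum((x - a) / (b - a + 1e-9),
                                 (c - x) / (c - b + 1e-9)), 0.0)

def fuzzify(x):
    """Degrees of membership in Low / Medium / High for one metric."""
    return {"low": tri(x, -0.5, 0.0, 0.5),
            "med": tri(x, 0.0, 0.5, 1.0),
            "high": tri(x, 0.5, 1.0, 1.5)}

def predict_faults(metrics, size_kloc):
    """Tiny two-rule base: riskier linguistic values -> more faults."""
    m = [fuzzify(v) for v in metrics]
    # Rule 1: all metrics High -> fault density High (~10 faults/KLOC, assumed).
    r_high = min(mi["high"] for mi in m)
    # Rule 2: all metrics Low -> fault density Low (~1 fault/KLOC, assumed).
    r_low = min(mi["low"] for mi in m)
    # Weighted-average defuzzification, scaled by software size.
    density = (10.0 * r_high + 1.0 * r_low) / (r_high + r_low + 1e-9)
    return density * size_kloc

print(predict_faults([0.8, 0.9, 0.7, 0.85], size_kloc=12.0))
```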

2.
Prediction of fault-prone modules provides one way to support software quality engineering through improved scheduling and project control. The primary goal of our research was to develop and refine techniques for early prediction of fault-prone modules. The objective of this paper is to review and improve an approach previously examined in the literature for building prediction models, i.e., principal component analysis (PCA) and discriminant analysis (DA). We present findings of an empirical study at Ericsson Telecom AB in which the previous approach was found inadequate for predicting the most fault-prone modules using software design metrics. Instead of dividing modules into fault-prone and not-fault-prone, modules are categorized into several groups according to the ordered number of faults. It is shown that the first discriminant coordinates (DC) statistically increase with the ordering of modules, thus improving prediction and prioritization efforts. The authors also experienced problems with the smoothing parameter as used previously for DA. To correct this problem and further improve predictability, separate estimation of the smoothing parameter is shown to be required.
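A hedged sketch of the PCA-plus-discriminant-analysis pipeline on synthetic design metrics; grouping modules by quantiles of a latent fault score stands in for the ordered fault-count groups used at Ericsson, and `LinearDiscriminantAnalysis` replaces whatever DA variant the study used.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 12))                  # 12 design metrics per module
score = X[:, :3].sum(axis=1) + rng.normal(size=300)   # latent fault score
groups = np.digitize(score, np.quantile(score, [0.5, 0.8, 0.95]))  # 4 ordered groups

model = make_pipeline(StandardScaler(), PCA(n_components=5),
                      LinearDiscriminantAnalysis())
model.fit(X, groups)

# The first discriminant coordinate should increase with the fault ordering.
dc1 = model[:-1].transform(X) @ model[-1].scalings_[:, 0]
for g in range(4):
    print("group", g, "mean DC1:", dc1[groups == g].mean())
```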

3.
A Software Reliability Prediction Model Based on an AGA-LVQ Neural Network
To address the low prediction accuracy of most current software reliability prediction models, this work exploits the nonlinear modeling capability of the LVQ neural network and the parameter-optimization capability of the adaptive genetic algorithm (AGA) to propose a software reliability prediction model based on AGA-LVQ. The data to be predicted are first preprocessed with principal component analysis (PCA) and related methods to reduce dimensionality and remove redundant and erroneous data; the adaptive genetic algorithm then computes the optimal initial weight vectors for the LVQ network; finally, the LVQ network is used for software reliability prediction experiments. Comparison with traditional methods shows that the approach achieves higher prediction accuracy.
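A minimal LVQ1 sketch of the predictor's core, under stated assumptions: the adaptive-GA search for optimal initial weight vectors is replaced here by random prototype initialization, and PCA preprocessing is presumed to have happened upstream.

```python
import numpy as np

def train_lvq1(X, y, n_protos_per_class=2, lr=0.1, epochs=30, seed=0):
    rng = np.random.default_rng(seed)
    protos, labels = [], []
    for c in np.unique(y):                  # seed prototypes from class samples
        idx = rng.choice(np.where(y == c)[0], n_protos_per_class)
        protos.append(X[idx]); labels += [c] * n_protos_per_class
    W, L = np.vstack(protos), np.array(labels)
    for ep in range(epochs):
        for i in rng.permutation(len(X)):
            j = np.argmin(((W - X[i]) ** 2).sum(axis=1))   # nearest prototype
            step = lr * (1 - ep / epochs) * (X[i] - W[j])
            W[j] += step if L[j] == y[i] else -step        # attract / repel
    return W, L

def predict(W, L, X):
    return L[((W[None] - X[:, None]) ** 2).sum(-1).argmin(1)]

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 5)); y = (X[:, 0] > 0).astype(int)
W, L = train_lvq1(X, y)
print("train accuracy:", (predict(W, L, X) == y).mean())
```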

4.
System analysts often use software fault prediction models to identify fault-prone modules during the design phase of the software development life cycle. The models help predict faulty modules based on the software metrics that are input to them. In this study, we consider 20 types of metrics to develop a model using an extreme learning machine associated with various kernel methods. We evaluate the effectiveness of the model using a proposed framework based on cost and efficiency in the testing phase. The evaluation is carried out through case studies of 30 object-oriented software systems. Experimental results demonstrate that applying a fault prediction model is suitable for projects in which the percentage of faulty classes is below a certain threshold, which depends on the efficiency of fault identification (low: 47.28%; median: 39.24%; high: 25.72%). We consider nine feature selection techniques to remove irrelevant metrics and select the best set of source code metrics for fault prediction.
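A small kernel extreme-learning-machine sketch on synthetic data; the RBF kernel, the 20-feature matrix, and the regularization constant are assumptions, and the study's cost/efficiency evaluation framework is not reproduced.

```python
import numpy as np

def rbf(A, B, gamma=0.1):
    """RBF kernel matrix between row sets A and B."""
    d = ((A[:, None] - B[None]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def kelm_fit(X, y, C=10.0, gamma=0.1):
    """Kernel ELM: solve (K + I/C) beta = y in closed form."""
    K = rbf(X, X, gamma)
    return np.linalg.solve(K + np.eye(len(X)) / C, y)

def kelm_predict(X_train, beta, X_new, gamma=0.1):
    return rbf(X_new, X_train, gamma) @ beta

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))              # 20 source-code metrics (synthetic)
y = (X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.3, size=200) > 1).astype(float)
beta = kelm_fit(X[:150], y[:150])
pred = kelm_predict(X[:150], beta, X[150:]) > 0.5
print("accuracy:", (pred == y[150:].astype(bool)).mean())
```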

5.
Software Fault Prediction Based on Neural Network Ensembles and Experimental Analysis
Fault prediction for software systems is a key topic in software reliability research during testing. Fault-related information from the early stages of testing is used to build models that predict faults in later stages, supporting a rational allocation of testing and verification resources. Given the known sequence of software failure times observed during testing, models are built with a non-homogeneous Poisson process, neural networks, and neural network ensembles. Across three case studies, the mean relative prediction errors of the G-O model were 3.02%, 5.88%, and 6.58%, versus 0.19%, 1.88%, and 1.455% for the neural network ensemble, indicating that the ensemble model predicts more accurately.
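A hedged sketch of the ensemble idea: several small MLPs each learn to map a window of past failure times to the next one, and their averaged output is the ensemble forecast. The failure-time series, window length, and network sizes are synthetic stand-ins, not the paper's data.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
t = np.cumsum(rng.exponential(scale=np.linspace(1, 4, 80)))  # failure times
win = 5
X = np.array([t[i:i + win] for i in range(len(t) - win)])    # sliding windows
y = t[win:]
X_tr, y_tr, X_te, y_te = X[:60], y[:60], X[60:], y[60:]

# Five MLPs differing only in random seed; the ensemble output is their mean.
members = [MLPRegressor(hidden_layer_sizes=(8,), max_iter=3000,
                        random_state=s).fit(X_tr, y_tr) for s in range(5)]
pred = np.mean([m.predict(X_te) for m in members], axis=0)
mre = np.mean(np.abs(pred - y_te) / y_te)        # mean relative error
print(f"ensemble mean relative error: {mre:.2%}")
```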

6.
7.
Finding security vulnerabilities requires a different mindset than finding general faults in software—thinking like an attacker. Therefore, security engineers looking to prioritize security inspection and testing efforts may be better served by a prediction model that indicates security vulnerabilities rather than faults. At the same time, faults and vulnerabilities have commonalities that may allow development teams to use traditional fault prediction models and metrics for vulnerability prediction. The goal of our study is to determine whether fault prediction models can be used for vulnerability prediction or if specialized vulnerability prediction models should be developed when both models are built with traditional metrics of complexity, code churn, and fault history. We have performed an empirical study on a widely-used, large open source project, the Mozilla Firefox web browser, where 21% of the source code files have faults and only 3% of the files have vulnerabilities. Both the fault prediction model and the vulnerability prediction model provide similar ability in vulnerability prediction across a wide range of classification thresholds. For example, the fault prediction model provided recall of 83% and precision of 11% at classification threshold 0.6 and the vulnerability prediction model provided recall of 83% and precision of 12% at classification threshold 0.5. Our results suggest that fault prediction models based upon traditional metrics can substitute for specialized vulnerability prediction models. However, both fault prediction and vulnerability prediction models require significant improvement to reduce false positives while providing high recall.
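A sketch of the threshold comparison described above: one logistic model over complexity, churn, and fault-history style features, with recall and precision reported at several classification thresholds. The data are synthetic and only mimic rare positives; none of the Firefox numbers are reproduced.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(3)
X = rng.normal(size=(2000, 3))               # complexity, churn, fault history
p = 1 / (1 + np.exp(-(X.sum(axis=1) - 3)))   # rare positives (~vulnerabilities)
y = rng.random(2000) < p

clf = LogisticRegression().fit(X[:1500], y[:1500])
probs = clf.predict_proba(X[1500:])[:, 1]

# Sweep the classification threshold, as the abstract does.
for thr in (0.3, 0.5, 0.6):
    pred = probs >= thr
    print(f"thr={thr}: recall={recall_score(y[1500:], pred):.2f}",
          f"precision={precision_score(y[1500:], pred, zero_division=0):.2f}")
```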

8.
In this study, defect tracking is used as a proxy method to predict software readiness. The number of remaining defects in an application under development is one of the most important factors in deciding whether a piece of software is ready to be released. By comparing the predicted number of faults with the number of faults discovered in testing, a software manager can decide whether the software is likely ready to be released. The predictive model developed in this research can predict: (i) the number of faults (defects) likely to exist, (ii) the estimated number of code changes required to correct a fault, and (iii) the estimated amount of time (in minutes) needed to make the changes in the respective classes of the application. The model uses product metrics as independent variables for its predictions. These metrics are selected according to the nature of the source code with regard to architecture layers, types of faults, and the contribution factors of these metrics. A neural network model with a genetic training strategy is introduced to improve prediction results for estimating software readiness. This genetic-net combines a genetic algorithm with a statistical estimator to produce a model that also shows the usefulness of its inputs. The model is divided into three parts: (1) a prediction model for the presentation logic tier, (2) a prediction model for the business tier, and (3) a prediction model for the data access tier. Existing object-oriented metrics and complexity software metrics are used in the business tier prediction model. New sets of metrics have been proposed for the presentation logic tier and the data access tier. These metrics are validated using data extracted from real-world applications. The trained models can be used as tools to assist software managers in making software release decisions.

9.
An empirical study of predicting software faults with case-based reasoning
The resources allocated for software quality assurance and improvement have not increased with the ever-increasing need for better software quality. A targeted software quality inspection can detect faulty modules and reduce the number of faults occurring during operations. We present a software fault prediction modeling approach with case-based reasoning (CBR), a part of the computational intelligence field focusing on automated reasoning processes. A CBR system functions as a software fault prediction model by quantifying, for a module under development, the expected number of faults based on similar modules that were previously developed. Such a system is composed of a similarity function, the number of nearest neighbor cases used for fault prediction, and a solution algorithm. The selection of a particular similarity function and solution algorithm may affect the performance accuracy of a CBR-based software fault prediction system. This paper presents an empirical study investigating the effects of using three different similarity functions and two different solution algorithms on the prediction accuracy of our CBR system. The influence of varying the number of nearest neighbor cases on the performance accuracy is also explored. Moreover, the benefits of using metric-selection procedures for our CBR system are also evaluated. Case studies of a large legacy telecommunications system are used for our analysis. It is observed that the CBR system using the Mahalanobis distance similarity function and the inverse distance weighted solution algorithm yielded the best fault prediction. In addition, the CBR models perform better than models based on multiple linear regression.
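A compact sketch of the configuration the study found best, assuming synthetic module metrics: Mahalanobis-distance similarity with an inverse-distance-weighted solution over the k nearest previously developed modules.

```python
import numpy as np

def cbr_predict(X_case, y_case, x_new, k=5):
    """Predict faults for x_new from the k most similar past modules."""
    VI = np.linalg.inv(np.cov(X_case, rowvar=False))       # inverse covariance
    diff = X_case - x_new
    d = np.sqrt(np.einsum("ij,jk,ik->i", diff, VI, diff))  # Mahalanobis distance
    nn = np.argsort(d)[:k]
    w = 1.0 / (d[nn] + 1e-9)                               # inverse distance weights
    return (w * y_case[nn]).sum() / w.sum()

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 6))                   # module metrics from past releases
y = np.abs(X[:, 0] * 2 + rng.normal(size=100))  # known fault counts (synthetic)
print(cbr_predict(X, y, rng.normal(size=6)))
```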

10.
Software metrics-based quality estimation models can be effective tools for identifying which modules are likely to be fault-prone or not fault-prone. The use of such models prior to system deployment can considerably reduce the likelihood of faults discovered during operations, hence improving system reliability. A software quality classification model is calibrated using metrics from a past release or similar project, and is then applied to modules currently under development. Subsequently, a timely prediction of which modules are likely to have faults can be obtained. However, software quality classification models used in practice may not provide a useful balance between the two misclassification rates, especially when there are very few faulty modules in the system being modeled. This paper presents, in the context of case-based reasoning, two practical classification rules that allow appropriate emphasis on each type of misclassification as per the project requirements. The suggested techniques are especially useful for high-assurance systems where faulty modules are rare. The proposed generalized classification methods emphasize the costs of misclassification and the unbalanced distribution of faulty program modules. We illustrate the proposed techniques with a case study that consists of software measurements and fault data collected over multiple releases of a large-scale legacy telecommunication system. In addition to investigating the two classification methods, a brief relative comparison of the techniques is also presented. It is indicated that the level of classification accuracy and model robustness observed for the case study would be beneficial in achieving high software reliability in subsequent system releases. Similar observations are made in our empirical studies with other case studies.
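A hedged illustration of the theme, not the paper's exact rules: deriving the classification cutoff from the two misclassification costs instead of a fixed 0.5, which matters when faulty modules are rare. The cost values are invented.

```python
import numpy as np

def cost_threshold(c_type1, c_type2):
    """c_type1: cost of flagging a good module (Type I); c_type2: cost of
    missing a faulty one (Type II). Predicting fault-prone when p >= threshold
    minimizes expected cost under a calibrated probability p."""
    return c_type1 / (c_type1 + c_type2)

p_hat = np.array([0.05, 0.12, 0.30, 0.55, 0.80])  # predicted fault-proneness
thr = cost_threshold(c_type1=1.0, c_type2=10.0)   # missing a fault is 10x worse
print(thr, p_hat >= thr)   # threshold ~0.09 -> most modules get inspected
```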

11.
Software fault prediction is the process of developing models that help developers detect faulty classes or modules in the early phases of the development life cycle and determine the modules that need more refactoring in the maintenance phase. Software reliability is the probability that the software operates without failure over a period of time; a system described as unreliable contains many errors, which may be tolerable in some systems but can lead to crucial problems in critical systems such as aircraft, space shuttles, and medical systems. Locating faulty software modules is therefore an essential step, because it helps define the modules that need more refactoring or more testing. In this article, an approach is developed by integrating a genetic algorithm (GA) with a support vector machine (SVM) classifier and a particle swarm algorithm for software fault prediction, as a step toward a better fault prediction technique. The developed approach is applied to 24 datasets (12 NASA MDP and 12 Java open-source projects), where NASA MDP is considered a large-scale dataset and the Java open-source projects are considered small-scale datasets. Results indicate that integrating GA with SVM and the particle swarm algorithm improves the performance of software fault prediction on both large-scale and small-scale datasets and overcomes the limitations of previous studies.

12.
A Software Reliability Prediction Model Using a Genetically Optimized Support Vector Machine
Software reliability prediction can identify fault-prone modules early in development. This paper proposes an improved support vector machine for software reliability prediction. To address the difficulty of selecting SVM parameters, a genetic algorithm is introduced into the parameter search, yielding a prediction model based on a genetically optimized SVM, with principal component analysis used to reduce the dimensionality of the software metric data. Simulation experiments show that the model achieves higher prediction accuracy than SVM, BP neural network, classification and regression tree, and clustering-based models.
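A hedged sketch of the pipeline described: PCA to reduce the metric space, then a deliberately tiny real-coded genetic search over the SVM's (C, gamma). The paper's actual GA operators and data are not known, so synthetic data, truncation selection, and Gaussian mutation are used here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
rng = np.random.default_rng(0)

def fitness(logC, logG):
    """Cross-validated accuracy of a PCA+SVC pipeline with given parameters."""
    model = make_pipeline(StandardScaler(), PCA(n_components=8),
                          SVC(C=10 ** logC, gamma=10 ** logG))
    return cross_val_score(model, X, y, cv=3).mean()

pop = rng.uniform(-3, 3, size=(10, 2))            # chromosomes: (logC, logG)
for gen in range(10):
    scores = np.array([fitness(c, g) for c, g in pop])
    parents = pop[np.argsort(scores)[-5:]]        # truncation selection
    children = parents[rng.integers(0, 5, 5)] + rng.normal(0, 0.3, (5, 2))
    pop = np.vstack([parents, children])          # elitism + mutation
best = pop[np.argmax([fitness(c, g) for c, g in pop])]
print("best (C, gamma):", 10 ** best[0], 10 ** best[1])
```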

13.
In this empirical study, we evaluate the extent to which a set of software measures are correlated with the number of faults and the total estimated repair effort for a large software system. The measures we use are basic counts reflecting program size and structure, and metrics proposed by McCabe and Halstead. Program size has a major influence on these metrics, and we present a suitable method of adjusting the metrics for size. In modeling faults or repair effort as a function of one variable, a number of measures individually explain approximately one-quarter of the variation observed in the fault data. No single measure does significantly better than size in explaining the variation in faults found across software units, and thus multiple-variable models are necessary to find metrics of importance in addition to program size. The “best” multivariate model explains approximately one-half the variation in the fault data. The metrics included in this model (in addition to size) are: the ratio of block comments to total lines of code, the number of decisions per function, and the relative vocabulary of program variables and operators. These metrics have potential for future use in the quality control of software.
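A minimal sketch of the kind of comparison reported: faults regressed on size alone versus size plus further metrics, with R² as the share of variation explained. The metric names and data are illustrative assumptions, not the study's measurements.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
n = 120
size = rng.uniform(50, 2000, n)                   # lines of code
comment_ratio = rng.uniform(0, 0.3, n)            # block comments / total LOC
decisions_per_fn = rng.uniform(1, 12, n)          # decisions per function
faults = (0.01 * size + 5 * decisions_per_fn - 20 * comment_ratio
          + rng.normal(scale=8, size=n))

X_size = size.reshape(-1, 1)
X_full = np.column_stack([size, comment_ratio, decisions_per_fn])
print("R^2, size only:",
      LinearRegression().fit(X_size, faults).score(X_size, faults))
print("R^2, multivariate:",
      LinearRegression().fit(X_full, faults).score(X_full, faults))
```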

14.
The authors describe a number of results from a quantitative study of faults and failures in two releases of a major commercial software system. They tested a range of basic software engineering hypotheses relating to: the Pareto principle of distribution of faults and failures; the use of early fault data to predict later fault and failure data; metrics for fault prediction; and benchmarking fault data. For example, we found strong evidence that a small number of modules contain most of the faults discovered in prerelease testing and that a very small number of modules contain most of the faults discovered in operation. We found no evidence to support previous claims relating module size to fault density nor did we find evidence that popular complexity metrics are good predictors of either fault-prone or failure-prone modules. We confirmed that the number of faults discovered in prerelease testing is an order of magnitude greater than the number discovered in 12 months of operational use. The most important result was strong evidence of a counter-intuitive relationship between pre- and postrelease faults; those modules which are the most fault-prone prerelease are among the least fault-prone postrelease, while conversely, the modules which are most fault-prone postrelease are among the least fault-prone prerelease. This observation has serious ramifications for the commonly used fault density measure. Our results provide data-points in building up an empirical picture of the software development process.

15.
This paper presents an empirical case study that predicted faults in modules based on the total information content of the operators. This metric is closely related to Harrison's average information content classification (AICC), which is the entropy of the operators. Most information theory-based metrics proposed in the literature have not been subjected to empirical predictive studies of real-world software systems. In contrast, this study shows that a simple information theory-based metric can be more useful for prediction of software quality than comparable metrics based on counts in the context of a commercial software development organization. Three models were considered, all based on operators as an abstraction of software. The model based on information content of the operators made more accurate predictions than two similar models based on the number of operators and the number of unique operators. The purpose of this paper is a fair comparison of the three metrics, rather than developing an optimal model. We have long advocated multivariate models for industrial use. The case study considered three large commercial systems, written in assembly language, and developed consecutively by professional programmers. The first system was used to estimate parameters of the models. The subsequent two were used to evaluate the accuracy of model predictions.
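A sketch of the three operator-based measures being compared: the operator count N1, the unique-operator count η1, and the total information content N1·H, where H is the entropy of the operator frequency distribution (Harrison's AICC). The token list is a toy stand-in for real assembly code.

```python
import math
from collections import Counter

operators = ["+", "=", "*", "=", "+", "if", "=", "+", "call", "="]
counts = Counter(operators)
N1 = sum(counts.values())                     # total operator count
eta1 = len(counts)                            # unique operator count
# Entropy of the operator frequency distribution (AICC).
H = -sum((c / N1) * math.log2(c / N1) for c in counts.values())
print(N1, eta1, H, N1 * H)                    # N1*H = total information content
```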

16.
A Survey of Static Software Fault Prediction Methods
Static software fault prediction extracts metric information from project data to predict faults, supporting the allocation of testing and verification resources. This survey summarizes static fault prediction methods along two dimensions: available metric data and prediction models. Metrics span the method, class, component, file, and process levels, while prediction models fall into machine learning and statistical approaches. The survey also summarizes open problems for further study, including performance evaluation measures, the availability of metric data, and the impact of fault classification on prediction.

17.
Research on Static Software Defect Prediction Methods
Static software defect prediction is an active research topic in software engineering data mining. Metrics related to software defects are designed by analyzing source code or the development process; defect prediction datasets are then created by mining software historical repositories, with the aim of building prediction models that identify potentially defective program modules in the project under test, ultimately optimizing the allocation of testing resources and improving software quality. This paper systematically reviews the results achieved by researchers at home and abroad in recent years. It first presents a research framework and identifies three key factors that affect defect prediction performance: the choice of metrics, the method of constructing the prediction model, and issues concerning the prediction dataset. It then reviews existing work on each of these three factors, summarizes prior work on a special class of defect prediction problems (namely, change-level defect prediction), and concludes with an outlook on challenges for future research.

18.
The knowledge, prior to system operations, of which program modules are problematic is valuable to a software quality assurance team, especially when there is a constraint on software quality enhancement resources. A cost-effective approach for allocating such resources is to obtain a prediction in the form of a quality-based ranking of program modules. Subsequently, a module-order model (MOM) is used to gauge the performance of the predicted rankings. From a practical software engineering point of view, multiple software quality objectives may be desired of a MOM for the system under consideration: e.g., the desired rankings may be such that 100% of the faults should be detected if the top 50% of modules with the highest number of faults are subjected to quality improvements. Moreover, the management team for the same system may also desire that 80% of the faults be accounted for if the top 20% of the modules are targeted for improvement. Existing work related to MOMs uses a quantitative prediction model to obtain the predicted rankings of program modules, implying that only fault prediction error measures such as the average, relative, or mean square errors are minimized. Such an approach does not provide direct insight into the performance behavior of a MOM. For a given percentage of modules enhanced, the performance of a MOM is gauged by how many faults are accounted for by the predicted ranking as compared with the perfect ranking. We propose an approach for calibrating a multiobjective MOM using genetic programming. Other estimation techniques, e.g., multiple linear regression and neural networks, cannot achieve multiobjective optimization for MOMs. The proposed methodology facilitates the simultaneous optimization of multiple performance objectives for a MOM. Case studies of two industrial software systems are presented, the empirical results of which demonstrate a new promise for goal-oriented software quality modeling.
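A sketch of how a module-order model's performance is gauged, on synthetic fault counts: for each cutoff percentage, the share of total faults captured by the predicted ranking is compared against the perfect (actual) ranking.

```python
import numpy as np

rng = np.random.default_rng(6)
actual = rng.poisson(2.0, 100).astype(float)           # faults per module
# A noisy predicted ranking versus the perfect one based on actual faults.
predicted_rank = np.argsort(-(actual + rng.normal(scale=2, size=100)))
perfect_rank = np.argsort(-actual)

for frac in (0.2, 0.5):
    top = int(frac * len(actual))
    got = actual[predicted_rank[:top]].sum() / actual.sum()
    best = actual[perfect_rank[:top]].sum() / actual.sum()
    print(f"top {frac:.0%}: model captures {got:.0%} of faults "
          f"(perfect ranking: {best:.0%})")
```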

19.
Ordering Fault-Prone Software Modules
Software developers apply various techniques early in development to improve software reliability, such as extra reviews, additional testing, and strategic assignment of personnel. Due to limited resources and time, it is often not practical to enhance the reliability of all modules. Our goal is to target reliability enhancement activities at those modules that would otherwise have problems later. Prior research has shown that a software quality model based on software product and process metrics can predict which modules are likely to have faults. A module-order model is a quantitative software quality model that is used to predict the rank order of modules according to a quality factor, such as the number of faults. The contribution of this paper is the definition of module-order models and a method for their evaluation and use. Two empirical case studies of full-scale industrial software systems provide empirical evidence of the usefulness of module-order models for targeting reliability enhancement.

20.
张晓风, 张德平. 《计算机科学》 (Computer Science), 2016, 43(Z11): 486-489, 494
Software defect prediction is an important direction in software reliability research. Many factors influence software failures, with complex interdependencies among them; analytical models typically describe them with a joint distribution function, which is hard to determine in practice and directly affects failure prediction. This paper proposes a failure prediction method based on quasi-likelihood estimation: principal component analysis screens the main factors influencing failures, a multi-factor failure prediction model is built, and the model parameters are estimated from the numerical characteristics of these factors (mean and variance functions) via quasi-likelihood estimation, after which failures are predicted and analyzed. Experiments on two real datasets, Eclipse JDT and Eclipse PDE, comparing against classical Logistic and Probit regression models, show that quasi-likelihood estimation is feasible for defect prediction and outperforms both classical regression models in prediction accuracy.
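A hedged sketch of the baseline comparison only: logistic and probit fits (via statsmodels) over a few principal-component-style factors of synthetic defect data. The quasi-likelihood estimator itself is not reproduced here, and the Eclipse JDT/PDE datasets are not used.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
X = sm.add_constant(rng.normal(size=(400, 3)))     # 3 PCA-style factors + intercept
p = 1 / (1 + np.exp(-(X @ np.array([-1.0, 1.2, 0.8, -0.5]))))
y = (rng.random(400) < p).astype(float)            # defect indicator

logit = sm.Logit(y, X).fit(disp=0)
probit = sm.Probit(y, X).fit(disp=0)
for name, m in (("logit", logit), ("probit", probit)):
    err = np.mean(np.abs(m.predict(X) - y))        # in-sample predictive error
    print(name, "mean abs error:", round(err, 3))
```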
