Similar literature
 20 similar articles found (search time: 15 ms)
1.
It is always useful to anticipate how a piece of work will turn out. Predicting software faults in the early phases of the software development life cycle helps software personnel deliver the product they intend, and early prediction is of great importance for optimizing the development cost of a software project. The present study proposes a methodology based on a Bayesian belief network to predict the total number of faults, and to reach a target value for the total number of faults, during the early development phases of the software life cycle. The model is built using information from similar or earlier-version software projects, domain experts' opinions, and software metrics. Interval type-2 fuzzy logic is applied to obtain the conditional probability values in the node probability tables of the belief network. The output pattern corresponding to the total number of faults is identified by an artificial neural network using input patterns from similar or earlier project data. The proposed Bayesian framework helps software personnel obtain the information about software metrics that they need at an early phase to achieve a targeted number of software faults. The proposed model has been applied to data from twenty-six software projects. The results have been validated using different statistical comparison criteria, and the performance of the proposed approach has been compared with some existing early fault prediction models.

2.
Network measures are useful for predicting fault-prone modules. However, existing work has not distinguished faults according to their severity. In practice, high-severity faults cause serious problems and require further attention. In this study, we explored the utility of network measures in high-severity fault-proneness prediction. We constructed software source code networks for four open-source projects by extracting the dependencies between modules. We then used univariate logistic regression to investigate the associations between each network measure and fault-proneness at a high severity level. We built multivariate prediction models to examine their explanatory ability for fault-proneness, and evaluated their predictive effectiveness compared to code metrics under forward-release and cross-project predictions. The results revealed the following: (1) most network measures are significantly related to high-severity fault-proneness; (2) network measures generally have explanatory abilities and predictive powers comparable to those of code metrics; and (3) network measures are very unstable for cross-project predictions. These results indicate that network measures are of practical value in high-severity fault-proneness prediction.

3.
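Irrelevant and redundant features in software defect datasets degrade the performance of models that predict the number of software defects. To address this problem, a hybrid feature selection method for defect-count prediction, HFSNFP, is proposed. First, the ReliefF algorithm is used to compute the relevance of each feature to the number of defects, and the m most relevant features are selected. Next, these m features are clustered by spectral clustering based on the correlations among the features. Finally, following the idea of wrapper-based feature selection, the most relevant features are picked from each cluster in turn to form the final feature subset. Experimental results show that, compared with five existing filter-based feature selection methods, HFSNFP raises the prediction rate while lowering the false alarm rate and achieves better G-measure and RMSE values; compared with two existing wrapper-based feature selection methods, HFSNFP significantly reduces feature selection time while maintaining defect-count prediction performance.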

4.
An empirical study of predicting software faults with case-based reasoning   Total citations: 1 (self-citations: 0, citations by others: 1)
The resources allocated for software quality assurance and improvement have not increased with the ever-increasing need for better software quality. A targeted software quality inspection can detect faulty modules and reduce the number of faults occurring during operations. We present a software fault prediction modeling approach with case-based reasoning (CBR), a part of the computational intelligence field focusing on automated reasoning processes. A CBR system functions as a software fault prediction model by quantifying, for a module under development, the expected number of faults based on similar modules that were previously developed. Such a system is composed of a similarity function, the number of nearest neighbor cases used for fault prediction, and a solution algorithm. The selection of a particular similarity function and solution algorithm may affect the performance accuracy of a CBR-based software fault prediction system. This paper presents an empirical study investigating the effects of using three different similarity functions and two different solution algorithms on the prediction accuracy of our CBR system. The influence of varying the number of nearest neighbor cases on the performance accuracy is also explored. Moreover, the benefits of using metric-selection procedures for our CBR system are also evaluated. Case studies of a large legacy telecommunications system are used for our analysis. It is observed that the CBR system using the Mahalanobis distance similarity function and the inverse distance weighted solution algorithm yielded the best fault prediction. In addition, the CBR models have better performance than models based on multiple linear regression.
Taghi M. Khoshgoftaar is a professor of the Department of Computer Science and Engineering, Florida Atlantic University and the Director of the Empirical Software Engineering Laboratory. His research interests are in software engineering, software metrics, software reliability and quality engineering, computational intelligence, computer performance evaluation, data mining, and statistical modeling. He has published more than 200 refereed papers in these areas. He has been a principal investigator and project leader in a number of projects with industry, government, and other research-sponsoring agencies. He is a member of the Association for Computing Machinery, the IEEE Computer Society, and IEEE Reliability Society. He served as the general chair of the 1999 International Symposium on Software Reliability Engineering (ISSRE’99), and the general chair of the 2001 International Conference on Engineering of Computer Based Systems. Also, he has served on technical program committees of various international conferences, symposia, and workshops. He has served as North American editor of the Software Quality Journal, and is on the editorial boards of the journals Empirical Software Engineering, Software Quality, and Fuzzy Systems.
Naeem Seliya received the M.S. degree in Computer Science from Florida Atlantic University, Boca Raton, FL, USA, in 2001. He is currently a Ph.D. candidate in the Department of Computer Science and Engineering at Florida Atlantic University. His research interests include software engineering, computational intelligence, data mining, software measurement, software reliability and quality engineering, software architecture, computer data security, and network intrusion detection. He is a student member of the IEEE Computer Society and the Association for Computing Machinery.
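The three parts of a CBR predictor named in the abstract above (a similarity function, a number of nearest neighbor cases, and a solution algorithm) can be illustrated with a minimal sketch. This is not the authors' system: the function name `predict_faults` and the case data are hypothetical, and Euclidean distance stands in for the Mahalanobis distance that the study found to work best, simply to keep the example dependency-free.

```python
import math

def predict_faults(cases, query, k=3):
    """Estimate the fault count for `query` from its k most similar past cases.

    cases: list of (metric_vector, fault_count) pairs for earlier modules.
    Euclidean distance is an assumption made for brevity; the study reports
    its best results with the Mahalanobis distance.
    """
    # Similarity function + nearest-neighbor selection: keep the k closest cases.
    scored = sorted(
        (math.dist(metrics, query), faults) for metrics, faults in cases
    )[:k]
    # Inverse-distance weights are undefined for an exact match, so in that
    # situation return the mean fault count of the exactly matching cases.
    exact = [faults for d, faults in scored if d == 0.0]
    if exact:
        return sum(exact) / len(exact)
    # Solution algorithm: inverse-distance-weighted average of fault counts.
    weights = [1.0 / d for d, _ in scored]
    return sum(w * f for w, (_, f) in zip(weights, scored)) / sum(weights)
```

Closer cases dominate the estimate, so a module that nearly matches a previously developed module inherits most of that module's fault count.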

5.
Predicting the location and number of faults in large software systems   Total citations: 6 (self-citations: 0, citations by others: 6)
Advance knowledge of which files in the next release of a large software system are most likely to contain the largest numbers of faults can be a very valuable asset. To accomplish this, a negative binomial regression model has been developed and used to predict the expected number of faults in each file of the next release of a system. The predictions are based on the code of the file in the current release, and fault and modification history of the file from previous releases. The model has been applied to two large industrial systems, one with a history of 17 consecutive quarterly releases over 4 years, and the other with nine releases over 2 years. The predictions were quite accurate: for each release of the two systems, the 20 percent of the files with the highest predicted number of faults contained between 71 percent and 92 percent of the faults that were actually detected, with the overall average being 83 percent. The same model was also used to predict which files of the first system were likely to have the highest fault densities (faults per KLOC). In this case, the 20 percent of the files with the highest predicted fault densities contained an average of 62 percent of the system's detected faults. However, the identified files contained a much smaller percentage of the code mass than the files selected to maximize the numbers of faults. The model was also used to make predictions from a much smaller input set that only contained fault data from integration testing and later. The prediction was again very accurate, identifying files that contained from 71 percent to 93 percent of the faults, with the average being 84 percent. Finally, a highly simplified version of the predictor selected files containing, on average, 73 percent and 74 percent of the faults for the two systems.
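The evaluation measure quoted in the abstract above (the share of detected faults that fall in the 20 percent of files with the highest predicted fault counts) can be computed as in the following sketch. It covers only the ranking-based evaluation, not the negative binomial regression model itself; the function name `faults_in_top_files` and any input data are hypothetical.

```python
def faults_in_top_files(predicted, actual, fraction=0.2):
    """Share of actual faults found in the top `fraction` of files by prediction.

    predicted: predicted fault count per file (any model's output).
    actual:    faults actually detected per file, in the same order.
    """
    # Number of files to inspect; at least one so small inputs still work.
    n_top = max(1, round(len(predicted) * fraction))
    # File indices ordered from highest to lowest predicted fault count.
    ranked = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    total = sum(actual)
    if total == 0:
        return 0.0
    return sum(actual[i] for i in ranked[:n_top]) / total
```

Ranking by predicted fault density instead of predicted fault count only requires dividing each prediction by the file's size before calling the function.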

6.
7.
Programming faults are defined in the framework of the program verification schema (proof outline). Component S in program P is faulty if P cannot be proved correct with the current implementation of S but it can be proved using the implementation specification for S. A programming error is a state that violates that specification. Conditions for error propagation and masking are expressed in terms of the relationships between the implementation and design specification of S, which defines the role of S in the overall design of P. Errors propagate due to the dependencies between program entities. It is shown that “classical” static dependencies, developed for the purpose of code optimization, are inadequate for the analysis of error propagation since they do not capture events that occur on individual paths through the program. A novel path analysis method is proposed to identify variables potentially corrupted on the same path due to the existence of the fault. The method is based upon error propagation axioms. The axioms are used to define path relations for structured programming constructs. The relations provide a conservative structural approximation to the semantic theory of error creation and propagation and are shown to be useful in testing, debugging and static analysis.

8.
9.
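RNA secondary structure prediction is of great importance in computational biology. For RNA secondary structure prediction, a new immune particle swarm ensemble algorithm is proposed. Based on the concentration and fitness probability of individuals, an immune replacement operator is designed within particle swarm optimization using an immune mechanism, which effectively counters the tendency of particle swarm optimization to become trapped in local optima. Through ensemble techniques, the strengths of various particle swarm optimization algorithms are fully exploited to achieve co-evolution, improving the global search capability of the algorithm. Finally, the immune particle swarm ensemble algorithm is used to predict RNA secondary structure, and experiments demonstrate the effectiveness of the algorithm.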

10.
Gastroscopy is important for finding suspicious stomach lesions, screening for gastric cancer, and providing early diagnoses. Because gastroscopists differ in their levels of diagnostic and treatment skill, clinical diagnosis based on gastroscopy is limited by low diagnostic sensitivity and specificity for gastric cancer. An assistive system for gastroscopy report analysis can help improve the success rate of gastric cancer detection. In this study, a homogeneous ensemble decision support system for gastric cancer screening (Endo-GCS) that performs word segmentation, feature extraction, and gastric cancer screening on text-based gastroscopy reports is proposed. The proposed Endo-GCS method establishes a progressive local weighting algorithm that improves the overall prediction performance of the homogeneous ensemble model in gastric cancer screening. An optimal threshold estimation algorithm is developed to minimize the negative impact of misdiagnoses and missed diagnoses. In a comparative experimental study using real gastroscopy report data, the pathological examination conclusion is taken as the gold standard. The sensitivity of the proposed Endo-GCS method is 88.27%, the specificity is 77.84%, and the accuracy is 82.11%, a significant improvement over the sensitivity (65.49%) and accuracy (80.5%) of the gastroscopic diagnosis results.
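The sensitivity, specificity, and accuracy figures reported above follow the standard confusion-matrix definitions. As a short illustration, the sketch below computes them from raw counts; the function name `screening_metrics` is hypothetical, and no counts from the study are reproduced because the abstract does not give them.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: cancer cases (per the gold standard) flagged / missed by the screen.
    tn/fp: non-cancer cases correctly cleared / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)   # share of true cancer cases detected
    specificity = tn / (tn + fp)   # share of non-cancer cases correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy
```

Lowering the decision threshold of a screening model typically raises sensitivity at the cost of specificity, which is the trade-off an optimal threshold estimation step has to balance.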

11.
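By identifying potentially defective program modules in the project under test in advance, software defect prediction can optimize the allocation of testing resources and improve the quality of software products. This paper studies the cross-project defect prediction problem in depth. For instance selection from source projects, three different instance similarity calculation methods are considered, and the defect prediction results of these methods are found to be diverse. An ensemble cross-project software defect prediction method based on the Box-Cox transformation, BCEL, is therefore proposed. Specifically, different training sets are first selected from the candidate set using the different instance similarity calculation methods; a tailored Box-Cox transformation is then applied to each of these datasets, and different base classifiers are built with a specific classification method; finally, the three base classifiers are combined into an ensemble. The effectiveness of BCEL is verified on datasets from real projects, and the influence of factors within BCEL on defect prediction performance is analyzed in depth.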
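For readers unfamiliar with the Box-Cox transformation mentioned above, the following sketch shows the standard one-parameter form, $(x^{\lambda} - 1)/\lambda$ for $\lambda \neq 0$ and $\ln x$ for $\lambda = 0$. How BCEL chooses $\lambda$ for each training set is not stated in the abstract, so `lam` is simply a caller-supplied parameter here, and the function name `box_cox` is hypothetical.

```python
import math

def box_cox(values, lam):
    """Apply the one-parameter Box-Cox transform to strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # Limiting case of (v**lam - 1) / lam as lam approaches 0.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]
```

Software metrics often contain zeros, so in practice a small positive shift is usually added to each value before the transform; that shift is a preprocessing choice and not part of the transform itself.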

12.
We present a new approach for creating repositories of real software faults. We have developed a tool, the Automatic Fault IDentification Tool (AFID), that implements this approach. AFID records both a fault-revealing test case and a faulty version of the source code for any crashing faults that the developer discovers, and a fault-correcting source code change for any crashing faults that the developer corrects. The test cases are a significant contribution, because they enable new research that explores the dynamic behaviors of the software faults. AFID uses an operating-system-level monitoring mechanism to monitor both the compilation and execution of the application. This technique makes it straightforward for AFID to support a wide range of programming languages and compilers.

13.
Application of neural networks for predicting program faults   Total citations: 1 (self-citations: 0, citations by others: 1)
Accurately predicting the number of faults in program modules is a major problem in the quality control of large software development efforts. Some software complexity metrics are closely related to the distribution of faults across program modules. Using these relationships, software engineers develop models that provide early estimates of quality metrics that do not become available until late in the development cycle. By considering these early estimates, software engineers can take actions to avoid or prepare for emerging quality problems. Most often, the predictive models are based upon multiple regression analysis. However, measures of software quality and complexity exhibit systematic departures from the assumptions of these analyses. With extreme violations of these assumptions, multiple regression models become unstable and lose most of their predictive quality. Since neural network models carry no data assumptions, these models could be more appropriate than regression models for modeling software faults. In this paper, we explore a neural network methodology for developing models that predict the number of faults in program modules. We apply this methodology to develop neural network models based upon data collected during the development of two commercial software systems. After developing neural network models, we apply multiple linear regression methods to develop regression models on the same data. For the data sets considered, the neural network methodology produced better predictive models in terms of both quality of fit and predictive quality.

14.
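Research on software reliability prediction based on neural network ensembles   Total citations: 1 (self-citations: 0, citations by others: 1)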
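To address the poor prediction accuracy and weak generalization ability of software reliability prediction, a software reliability prediction model based on a genetic-algorithm neural network ensemble is proposed. The ensemble weights of the neural networks are optimized by a genetic algorithm, and the software attribute metric data are preprocessed with principal component analysis to reduce data dimensionality, simplify the neural network structure, and speed up neural network computation. Simulation results show that the model achieves better prediction accuracy and generalization ability than BP, LVQ, and PNN networks.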

15.
(Semi-)automated diagnosis of software faults can drastically increase debugging efficiency, improving reliability and time-to-market. Current automatic diagnosis techniques are predominantly of a statistical nature and, despite typical defect densities, do not explicitly consider multiple faults, as also demonstrated by the popularity of the single-fault benchmark set of programs. We present a reasoning approach, called Zoltar-M(ultiple fault), that yields multiple-fault diagnoses, ranked in order of their probability. Although application of Zoltar-M to programs with many faults requires heuristics (trading off completeness) to reduce the inherent computational complexity, theory as well as experiments on synthetic program models and multiple-fault program versions available from the software infrastructure repository (SIR) show that for multiple-fault programs this approach can outperform statistical techniques, notably spectrum-based fault localization (SFL). As a side effect of this research, we present a new SFL variant, called Zoltar-S(ingle fault), that is optimal for single-fault programs, outperforming all other variants known to date.
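To make the statistical baseline above concrete, here is a minimal sketch of spectrum-based fault localization. It is not Zoltar-M or Zoltar-S: it ranks components with the Ochiai similarity coefficient, a commonly used SFL measure that the abstract does not name, and the function name `ochiai_ranking` and the input layout are assumptions made for illustration.

```python
import math

def ochiai_ranking(spectra, failed):
    """Rank components by Ochiai similarity between coverage and test failures.

    spectra: one row per test run; row[j] is True if component j was executed.
    failed:  failed[i] is True if test run i failed.
    Returns (component indices from most to least suspicious, raw scores).
    """
    total_failed = sum(failed)
    scores = []
    for j in range(len(spectra[0])):
        # Runs that executed component j, split by outcome.
        exec_fail = sum(1 for row, f in zip(spectra, failed) if row[j] and f)
        exec_pass = sum(1 for row, f in zip(spectra, failed) if row[j] and not f)
        # Ochiai: exec_fail / sqrt(total failed runs * runs executing j).
        denom = math.sqrt(total_failed * (exec_fail + exec_pass))
        scores.append(exec_fail / denom if denom else 0.0)
    ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranking, scores
```

Each component is scored independently of the others, which is why a measure of this kind does not explicitly reason about several faults acting together.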

16.
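To improve the accuracy of secondary structure prediction, a neural network ensemble method is tried. To address the complex structure, long convergence time, and large number of parameters of the BRNN, this paper proposes an improved BRNN that removes the hidden layers of the BRNN's left and right sub-networks, connects the input directly to the state layer, and is trained with the resilient algorithm, one of the improved BP algorithms. Cross-validation on 90 protein sequences totaling 15,377 amino acids shows in simulation that the new network effectively shortens convergence time and that the new BRNN ensemble predicts secondary structure well.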

17.
Innovations in Systems and Software Engineering - Unlike several other engineering disciplines, software engineering lacks well-defined research strategies. However, with the exponential rise in...

18.
Software defect prediction aims to predict the defect proneness of new software modules from historical defect data so as to improve the quality of a software system. Historical software defect data has a complicated structure and is markedly class-imbalanced; how to fully analyze and utilize the existing historical defect data and build more precise and effective classifiers has attracted considerable interest from researchers in both academia and industry. Multiple kernel learning and ensemble learning are effective techniques in the field of machine learning. Multiple kernel learning can map the historical defect data to a higher-dimensional feature space in which they are better represented, and ensemble learning can use a series of weak classifiers to reduce the bias generated by the majority class and obtain better predictive performance. In this paper, we propose to use multiple kernel learning to predict software defects. Using the characteristics of metrics mined from open source software, we obtain a multiple kernel classifier through an ensemble learning method, which has the advantages of both multiple kernel learning and ensemble learning. We thus propose a multiple kernel ensemble learning (MKEL) approach for software defect classification and prediction. Considering the cost of risk in software defect prediction, we design a new sample weight vector updating strategy to reduce the cost of risk caused by misclassifying defective modules as non-defective ones. We employ the widely used NASA MDP datasets as test data to evaluate the performance of all compared methods; experimental results show that MKEL outperforms several representative state-of-the-art defect prediction methods.

19.
20.