Similar Documents
20 similar documents retrieved (search time: 46 ms)
1.
A Software Reliability Prediction Model Based on a Genetically Optimized Support Vector Machine   Cited by: 5 (self-citations: 0; other citations: 5)
Software reliability prediction can identify fault-prone modules early in software development. This paper proposes an improved support vector machine (SVM) for software reliability prediction. To address the difficulty of selecting SVM parameters, a genetic algorithm is introduced into the parameter-selection process, yielding a software reliability prediction model based on a genetically optimized SVM; principal component analysis is used to reduce the dimensionality of the software metric data. Simulation experiments show that this model achieves higher prediction accuracy than prediction models based on the plain SVM, BP neural networks, classification and regression trees, and cluster analysis.
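
For intuition, the following is a minimal Python sketch of the pipeline this abstract describes: PCA for dimensionality reduction plus a toy genetic search over SVM hyperparameters. The data, population size, and mutation scheme are invented stand-ins, not the authors' setup.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))                        # placeholder metric data
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)  # placeholder fault labels

def fitness(log_c, log_gamma):
    """Cross-validated accuracy of a PCA + SVM pipeline for one (C, gamma)."""
    model = make_pipeline(StandardScaler(), PCA(n_components=5),
                          SVC(C=10**log_c, gamma=10**log_gamma))
    return cross_val_score(model, X, y, cv=5).mean()

# Toy genetic algorithm over (log10 C, log10 gamma): selection plus mutation.
pop = rng.uniform([-2, -4], [3, 1], size=(20, 2))
for _ in range(10):
    scores = np.array([fitness(*ind) for ind in pop])
    parents = pop[np.argsort(scores)[-10:]]            # keep the fitter half
    children = parents[rng.integers(0, 10, 10)] + rng.normal(0, 0.3, (10, 2))
    pop = np.vstack([parents, children])
best = pop[np.argmax([fitness(*ind) for ind in pop])]
print("best log10(C), log10(gamma):", best)
```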

2.
The authors describe a number of results from a quantitative study of faults and failures in two releases of a major commercial software system. They tested a range of basic software engineering hypotheses relating to: the Pareto principle of distribution of faults and failures; the use of early fault data to predict later fault and failure data; metrics for fault prediction; and benchmarking fault data. For example, they found strong evidence that a small number of modules contain most of the faults discovered in prerelease testing and that a very small number of modules contain most of the faults discovered in operation. They found no evidence to support previous claims relating module size to fault density, nor did they find evidence that popular complexity metrics are good predictors of either fault-prone or failure-prone modules. They confirmed that the number of faults discovered in prerelease testing is an order of magnitude greater than the number discovered in 12 months of operational use. The most important result was strong evidence of a counter-intuitive relationship between pre- and postrelease faults: the modules that are most fault-prone prerelease are among the least fault-prone postrelease, while conversely, the modules that are most fault-prone postrelease are among the least fault-prone prerelease. This observation has serious ramifications for the commonly used fault density measure. These results provide data points for building up an empirical picture of the software development process.

3.
System analysts often use software fault prediction models to identify fault-prone modules during the design phase of the software development life cycle. The models help predict faulty modules based on the software metrics that are input to the models. In this study, we consider 20 types of metrics to develop a model using an extreme learning machine associated with various kernel methods. We evaluate the effectiveness of the model using a proposed framework based on the cost and efficiency in the testing phases. The evaluation process is carried out by considering case studies for 30 object-oriented software systems. Experimental results demonstrate that the application of a fault prediction model is suitable for projects with the percentage of faulty classes below a certain threshold, which depends on the efficiency of fault identification (low: 47.28%; median: 39.24%; high: 25.72%). We consider nine feature selection techniques to remove the irrelevant metrics and to select the best set of source code metrics for fault prediction.
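
As a rough illustration (not the paper's implementation), a kernel extreme learning machine can be trained in closed form; the RBF kernel, regularization constant, and data below are assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def kelm_fit(X, y, C=10.0, gamma=0.1):
    """Closed-form kernel ELM training: beta = (I/C + K)^-1 y."""
    K = rbf_kernel(X, X, gamma=gamma)
    return np.linalg.solve(np.eye(len(X)) / C + K, y)

def kelm_predict(X_train, beta, X_test, gamma=0.1):
    return rbf_kernel(X_test, X_train, gamma=gamma) @ beta

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 20))                 # placeholder: 20 source-code metrics
y = (X[:, :3].sum(axis=1) > 0).astype(float)   # placeholder fault labels
beta = kelm_fit(X[:100], y[:100])
pred = (kelm_predict(X[:100], beta, X[100:]) > 0.5).astype(int)
print("test accuracy:", (pred == y[100:]).mean())
```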

4.
Many studies use logistic regression models to investigate the ability of complexity metrics to predict fault-prone classes. However, it is not uncommon to see the inappropriate use of performance indicators such as the odds ratio in previous studies. In particular, a recent study by Olague et al. uses the odds ratio associated with a one-unit increase in a metric to compare the relative magnitude of the associations between individual metrics and fault-proneness. In addition, the percents of concordant, discordant, and tied pairs are used to evaluate the predictive effectiveness of a univariate logistic regression model. Their results suggest that lesser known complexity metrics such as standard deviation method complexity (SDMC) and average method complexity (AMC) are better predictors than the two commonly used metrics: lines of code (LOC) and weighted method McCabe complexity (WMC). In this paper, however, we show that (1) the odds ratio associated with a one-standard-deviation increase, rather than a one-unit increase, in a metric should be used to compare the relative magnitudes of the effects of individual metrics on fault-proneness; otherwise, misleading results may be obtained; and that (2) the connection of the percents of concordant, discordant, and tied pairs with the predictive effectiveness of a univariate logistic regression model is false, as they indeed do not depend on the model. Furthermore, we use the data collected from three versions of Eclipse to re-examine the ability of complexity metrics to predict fault-proneness. Our experimental results reveal that: (1) many metrics exhibit moderate or almost moderate ability in discriminating between fault-prone and not fault-prone classes; (2) LOC and WMC are indeed better fault-proneness predictors than SDMC and AMC; and (3) the explanatory power of other complexity metrics in addition to LOC is limited.
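
Point (1) can be made concrete with a small worked example: for a fitted coefficient beta, the odds ratio per unit increase is exp(beta), while the odds ratio per standard-deviation increase is exp(beta * sigma). The numbers below are invented purely to show why unit-based comparisons mislead.

```python
import math

# Hypothetical fitted coefficients and metric standard deviations.
beta_loc, sigma_loc = 0.005, 300.0  # lines of code: tiny unit effect, huge spread
beta_wmc, sigma_wmc = 0.15, 8.0     # weighted methods per class

for name, beta, sigma in [("LOC", beta_loc, sigma_loc), ("WMC", beta_wmc, sigma_wmc)]:
    print(f"{name}: OR per unit = {math.exp(beta):.3f}, "
          f"OR per SD = {math.exp(beta * sigma):.3f}")
# LOC looks negligible per unit (1.005) but dominant per SD (~4.48), so only
# SD-scaled odds ratios are comparable across metrics.
```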

5.
Automatically generating a brief summary for legal-related public opinion news (LPO-news, which contains legal words or phrases) plays an important role in rapid and effective public opinion disposal. For LPO-news, the critical case elements, which are significant parts of the summary, may be mentioned several times in the reader comments. Consequently, we investigate the task of comment-aware abstractive text summarization for LPO-news, which can generate a salient summary by learning pivotal case elements from the reader comments. In this paper, we present a hierarchical comment-aware encoder (HCAE), which contains four components: 1) a traditional sequence-to-sequence framework as our baseline; 2) a selective denoising module to filter the noise in the comments and distinguish the case elements; 3) a merge module coupling the source article and comments to yield a comment-aware context representation; 4) a recoding module to capture the interaction among the source article words conditioned on the comments. Extensive experiments are conducted on a large dataset of legal public opinion news collected from micro-blog, and the results show that the proposed model outperforms several existing state-of-the-art baseline models under the ROUGE metrics.

6.
Prediction of fault-prone modules provides one way to support software quality engineering through improved scheduling and project control. The primary goal of our research was to develop and refine techniques for early prediction of fault-prone modules. The objective of this paper is to review and improve an approach previously examined in the literature for building prediction models, i.e. principal component analysis (PCA) and discriminant analysis (DA). We present findings of an empirical study at Ericsson Telecom AB for which the previous approach was found inadequate for predicting the most fault-prone modules using software design metrics. Instead of dividing modules into fault-prone and not-fault-prone, modules are categorized into several groups according to the ordered number of faults. It is shown that the first discriminant coordinates (DC) statistically increase with the ordering of modules, thus improving prediction and prioritization efforts. The authors also experienced problems with the smoothing parameter as used previously for DA. To correct this problem and further improve predictability, separate estimation of the smoothing parameter is shown to be required.

7.
In the last decade, empirical studies on object-oriented design metrics have shown some of them to be useful for predicting the fault-proneness of classes in object-oriented software systems. This research did not, however, distinguish among faults according to the severity of impact. It would be valuable to know how object-oriented design metrics and class fault-proneness are related when fault severity is taken into account. In this paper, we use logistic regression and machine learning methods to empirically investigate the usefulness of object-oriented design metrics, specifically, a subset of the Chidamber and Kemerer suite, in predicting fault-proneness when taking fault severity into account. Our results, based on a public domain NASA data set, indicate that 1) most of these design metrics are statistically related to fault-proneness of classes across fault severity, and 2) the prediction capabilities of the investigated metrics greatly depend on the severity of faults. More specifically, these design metrics are able to predict low severity faults in fault-prone classes better than high severity faults in fault-prone classes.
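
A hedged sketch of this kind of severity-aware analysis follows; the metrics, labels, and per-severity model choice are assumptions rather than the paper's code.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 300
metrics = {"WMC": rng.poisson(10, n), "CBO": rng.poisson(5, n)}
faulty_low = (rng.random(n) < 0.3).astype(int)   # low-severity fault labels
faulty_high = (rng.random(n) < 0.1).astype(int)  # high-severity fault labels

# One univariate logistic regression per metric and per severity level.
for severity, y in [("low", faulty_low), ("high", faulty_high)]:
    for name, x in metrics.items():
        model = sm.Logit(y, sm.add_constant(x.astype(float))).fit(disp=0)
        print(f"{severity:4s} severity, {name}: beta={model.params[1]:+.3f}, "
              f"p={model.pvalues[1]:.3f}")
```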

8.
Network measures are useful for predicting fault-prone modules. However, existing work has not distinguished faults according to their severity. In practice, high severity faults cause serious problems and require further attention. In this study, we explored the utility of network measures in high severity fault-proneness prediction. We constructed software source code networks for four open-source projects by extracting the dependencies between modules. We then used univariate logistic regression to investigate the associations between each network measure and fault-proneness at a high severity level. We built multivariate prediction models to examine their explanatory ability for fault-proneness, as well as evaluated their predictive effectiveness compared to code metrics under forward-release and cross-project predictions. The results revealed the following: (1) most network measures are significantly related to high severity fault-proneness; (2) network measures generally have comparable explanatory abilities and predictive powers to those of code metrics; and (3) network measures are very unstable for cross-project predictions. These results indicate that network measures are of practical value in high severity fault-proneness prediction.
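
To illustrate the kind of measures involved, the sketch below builds a toy module-dependency graph and computes degree and betweenness per module; the edge list and the chosen measures are assumptions.

```python
import networkx as nx

# Hypothetical "module A depends on module B" edges from static analysis.
deps = [("ui", "core"), ("ui", "net"), ("net", "core"),
        ("core", "util"), ("db", "util"), ("db", "core")]
g = nx.DiGraph(deps)

bc = nx.betweenness_centrality(g)
for m in g.nodes:
    # Each measure becomes one predictor column in the logistic regressions.
    print(f"{m:5s} in={g.in_degree(m)} out={g.out_degree(m)} "
          f"betweenness={bc[m]:.3f}")
```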

9.
Software metrics-based quality estimation models can be effective tools for identifying which modules are likely to be fault-prone or not fault-prone. The use of such models prior to system deployment can considerably reduce the likelihood of faults discovered during operations, hence improving system reliability. A software quality classification model is calibrated using metrics from a past release or similar project, and is then applied to modules currently under development. Subsequently, a timely prediction of which modules are likely to have faults can be obtained. However, software quality classification models used in practice may not provide a useful balance between the two misclassification rates, especially when there are very few faulty modules in the system being modeled. This paper presents, in the context of case-based reasoning, two practical classification rules that allow appropriate emphasis on each type of misclassification as per the project requirements. The suggested techniques are especially useful for high-assurance systems where faulty modules are rare. The proposed generalized classification methods emphasize the costs of misclassifications and the unbalanced distribution of the faulty program modules. We illustrate the proposed techniques with a case study that consists of software measurements and fault data collected over multiple releases of a large-scale legacy telecommunication system. In addition to investigating the two classification methods, a brief relative comparison of the techniques is also presented. It is indicated that the level of classification accuracy and model-robustness observed for the case study would be beneficial in achieving high software reliability of its subsequent system releases. Similar observations are made from our empirical studies with other case studies.
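
One way to realize such a cost-sensitive, case-based rule (a sketch under assumed data and parameters, not the paper's exact method) is to label a module fault-prone when the fault-prone proportion among its nearest past cases exceeds a threshold set by the misclassification-cost ratio.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(3)
X_past = rng.normal(size=(500, 8))                 # past-release metric vectors
y_past = (rng.random(500) < 0.06).astype(int)      # rare fault-prone (fp) cases
X_new = rng.normal(size=(10, 8))                   # modules under development

c1, c2 = 1.0, 10.0          # cost of nfp->fp vs. the costlier fp->nfp error
threshold = c1 / (c1 + c2)  # Bayes-style rule: predict fp when P(fp) > C_I/(C_I+C_II)

nn = NearestNeighbors(n_neighbors=15).fit(X_past)
_, idx = nn.kneighbors(X_new)
p_fp = y_past[idx].mean(axis=1)                    # fp proportion among retrieved cases
print("predicted fp:", (p_fp > threshold).astype(int))
```

Lowering the threshold (raising c2) deliberately trades more false alarms for fewer missed faulty modules, which is the balance high-assurance projects want.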

10.
A code clone is a code portion in source files that is identical or similar to another. Since code clones are believed to reduce the maintainability of software, several code clone detection techniques and tools have been proposed. This paper proposes a new clone detection technique, which consists of the transformation of input source text and a token-by-token comparison. For its implementation with several useful optimization techniques, we have developed a tool, named CCFinder (Code Clone Finder), which extracts code clones in C, C++, Java, COBOL and other source files. In addition, metrics for the code clones have been developed. In order to evaluate the usefulness of CCFinder and metrics, we conducted several case studies where we applied the new tool to the source code of JDK, FreeBSD, NetBSD, Linux, and many other systems. As a result, CCFinder has effectively found clones and the metrics have been able to effectively identify the characteristics of the systems. In addition, we have compared the proposed technique with other clone detection techniques.
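
The transform-then-compare idea can be illustrated in a few lines (a toy, not CCFinder itself): identifiers and literals are normalized to placeholder tokens so that renamed copies still match token for token.

```python
import re

def tokenize(code):
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    keywords = {"if", "return", "int", "for", "while"}
    # Parameter substitution: normalize names/literals so renamed copies match.
    return ["$id" if t[0].isalpha() and t not in keywords
            else "$num" if t[0].isdigit() else t for t in tokens]

def clone_ngrams(a, b, n=8):
    """Report token n-grams shared by the two normalized token sequences."""
    ta, tb = tokenize(a), tokenize(b)
    grams_b = {tuple(tb[i:i + n]) for i in range(len(tb) - n + 1)}
    return [tuple(ta[i:i + n]) for i in range(len(ta) - n + 1)
            if tuple(ta[i:i + n]) in grams_b]

f1 = "int total(int a, int b) { return a + b; }"
f2 = "int sum(int x, int y) { return x + y; }"
print(len(clone_ngrams(f1, f2)), "matching 8-grams")  # renamed code still matches
```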

11.

Subtractive Pixel Adjacency Matrix (SPAM) features perform well in detecting spatial-domain steganographic algorithms. Furthermore, some methods built on SPAM features carry over to rich models and to steganalysis based on deep learning. This paper therefore presents a study of SPAM features in two parts. In the first part, the impact of spatial-domain steganography on the differences between adjacent pixels is analyzed. Three SPAM features are then proposed with the same range of differences but different orders of the Markov chain. Following that, the influence of the Markov-chain order and of the range of differences on SPAM features is analyzed, and we find that the detection accuracy of SPAM features increases as the range of differences increases. In the second part, the SPAM feature is first divided into several modules according to this conclusion. Then, taking the detection accuracy of a support vector machine (SVM) classifier and mutual information as metrics, and a module as the unit, a Novel Feature Selection (NFS) algorithm and an Improved Feature Selection algorithm are proposed. Experimental results show that the NFS algorithm achieves higher detection accuracy than several existing algorithms.
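
A minimal sketch of first-order SPAM-style features along a single direction is shown below; the threshold T, the single left-to-right direction, and the joint-frequency normalization are simplifications relative to the full feature set.

```python
import numpy as np

def spam_first_order(img, T=3):
    """Clipped adjacent-pixel differences, then transition counts a -> b."""
    d = np.clip(img[:, :-1].astype(int) - img[:, 1:].astype(int), -T, T)
    m = np.zeros((2 * T + 1, 2 * T + 1))
    for row in d:
        for a, b in zip(row[:-1], row[1:]):
            m[a + T, b + T] += 1
    return m / max(m.sum(), 1)   # normalized frequencies over the difference chain

rng = np.random.default_rng(4)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
features = spam_first_order(img).ravel()   # (2T+1)^2 = 49 features
print(features.shape)
```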


12.
Software metrics-based quality classification models predict a software module as either fault-prone (fp) or not fault-prone (nfp). Timely application of such models can assist in directing quality improvement efforts to modules that are likely to be fp during operations, thereby cost-effectively utilizing the software quality testing and enhancement resources. Since several classification techniques are available, a relative comparative study of some commonly used classification techniques can be useful to practitioners. We present a comprehensive evaluation of the relative performances of seven classification techniques and/or tools. These include logistic regression, case-based reasoning, classification and regression trees (CART), tree-based classification with S-PLUS, and the Sprint-Sliq, C4.5, and Treedisc algorithms. The use of the expected cost of misclassification (ECM) is introduced as a singular unified measure to compare the performances of different software quality classification models. A function of the costs of the Type I (a nfp module misclassified as fp) and Type II (a fp module misclassified as nfp) misclassifications, ECM is computed for different cost ratios. Evaluating software quality classification models in the presence of varying cost ratios is important, because the usefulness of a model is dependent on the system-specific costs of misclassifications. Moreover, models should be compared and preferred for cost ratios that fall within the range of interest for the given system and project domain. Software metrics were collected from four successive releases of a large legacy telecommunications system. A two-way ANOVA randomized-complete block design modeling approach is used, in which the system release is treated as a block, while the modeling method is treated as a factor. It is observed that the predictive performances of the models are significantly different across the system releases, implying that in the software engineering domain prediction models are influenced by the characteristics of the data and the system being modeled. Multiple-pairwise comparisons are performed to evaluate the relative performances of the seven models for the cost ratios of interest to the case study. In addition, the performance of the seven classification techniques is also compared with a classification based on lines of code. The comparative approach presented in this paper can also be applied to other software systems.
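
The ECM comparison can be made concrete with a small worked example; the error counts below are invented.

```python
def ecm(n_type1, n_type2, n_total, cost_ratio, c1=1.0):
    """ECM = (C_I * n_I + C_II * n_II) / N, with C_II = cost_ratio * C_I."""
    return (c1 * n_type1 + cost_ratio * c1 * n_type2) / n_total

# Two hypothetical models evaluated on 1000 modules.
model_a = (80, 10)   # many false alarms, few missed faulty modules
model_b = (20, 30)   # few false alarms, more missed faulty modules
for c in (1, 10, 50):
    print(f"cost ratio {c:2d}: ECM(A)={ecm(*model_a, 1000, c):.3f}, "
          f"ECM(B)={ecm(*model_b, 1000, c):.3f}")
# The preferred model flips as the cost ratio grows, which is why models must
# be compared over the cost-ratio range relevant to the project.
```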

13.
The detection of fault-prone programs   Cited by: 1 (self-citations: 0; other citations: 1)
The use of the statistical technique of discriminant analysis as a tool for the detection of fault-prone programs is explored. A principal-components procedure was employed to reduce simple multicollinear complexity metrics to uncorrelated measures on orthogonal complexity domains. These uncorrelated measures were then used to classify programs into alternate groups, depending on the metric values of the program. The criterion variable for group determination was a quality measure of faults or changes made to the programs. The discriminant analysis was conducted on two distinct data sets from large commercial systems. The basic discriminant model was constructed from deliberately biased data to magnify differences in metric values between the discriminant groups. The technique was successful in classifying programs with a relatively low error rate. While the use of linear regression models has produced models of limited value, this procedure shows great promise for use in the detection of program modules with potential for faults.
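
A hedged sketch of the two-step procedure, with synthetic correlated metrics standing in for the commercial data sets:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
base = rng.normal(size=(300, 3))
X = np.hstack([base, base @ rng.normal(size=(3, 5))])   # multicollinear metrics
y = (base[:, 0] > 1.0).astype(int)                      # fault-prone group label

# Step 1: principal components yield uncorrelated complexity domains.
Z = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(X))
# Step 2: discriminant analysis classifies programs on those domains.
lda = LinearDiscriminantAnalysis().fit(Z, y)
print("apparent error rate:", 1 - lda.score(Z, y))
```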

14.
Empirical validation of software metrics suites to predict fault proneness in object-oriented (OO) components is essential to ensure their practical use in industrial settings. In this paper, we empirically validate three OO metrics suites for their ability to predict software quality in terms of fault-proneness: the Chidamber and Kemerer (CK) metrics, Abreu's Metrics for Object-Oriented Design (MOOD), and Bansiya and Davis' Quality Metrics for Object-Oriented Design (QMOOD). Some CK class metrics have previously been shown to be good predictors of initial OO software quality. However, the other two suites have not been heavily validated except by their original proposers. Here, we explore the ability of these three metrics suites to predict fault-prone classes using defect data for six versions of Rhino, an open-source implementation of JavaScript written in Java. We conclude that the CK and QMOOD suites contain similar components and produce statistical models that are effective in detecting error-prone classes. We also conclude that the class components in the MOOD metrics suite are not good class fault-proneness predictors. Analyzing multivariate binary logistic regression models across six Rhino versions indicates these models may be useful in assessing quality in OO classes produced using modern highly iterative or agile software development processes.

15.
Software Structure Metrics Based on Information Flow   Cited by: 2 (self-citations: 0; other citations: 2)
Structured design methodologies provide a disciplined and organized guide to the construction of software systems. However, while the methodology structures and documents the points at which design decisions are made, it does not provide a specific, quantitative basis for making these decisions. Typically, the designers' only guidelines are qualitative, perhaps even vague, principles such as "functionality," "data transparency," or "clarity." This paper, like several recent publications, defines and validates a set of software metrics which are appropriate for evaluating the structure of large-scale systems. These metrics are based on the measurement of information flow between system components. Specific metrics are defined for procedure complexity, module complexity, and module coupling. The validation, using the source code for the UNIX operating system, shows that the complexity measures are strongly correlated with the occurrence of changes. Further, the metrics for procedures and modules can be interpreted to reveal various types of structural flaws in the design and implementation.
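
In the well-known Henry-and-Kafura form of these measures, procedure complexity is length * (fan-in * fan-out)^2; the flow table below is hypothetical.

```python
procedures = {
    #  name:        (length, fan_in, fan_out)
    "parse_input":  (120, 3, 2),
    "dispatch":     (40, 8, 9),
    "write_log":    (25, 12, 1),
}
for name, (length, fan_in, fan_out) in procedures.items():
    complexity = length * (fan_in * fan_out) ** 2
    print(f"{name:12s} complexity = {complexity}")
# "dispatch" dominates despite its short length: heavy information flow, not
# size, is what this family of metrics flags as a structural hot spot.
```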

16.
Large-scale object-oriented (OO) software systems have recently been found to share global network characteristics such as small world and scale free, which go beyond the scope of traditional software measurement and assessment methodologies. To measure the complexity at various levels of granularity, namely graph, class (and object) and source code, we propose a hierarchical set of metrics in terms of coupling and cohesion — the most important characteristics of software, and analyze a sample of 12 open-source OO software systems to empirically validate the set. Experimental results of the correlations between cross-level metrics indicate that the graph measures of our set complement traditional software metrics well from the viewpoint of network thinking, and provide more effective information about fault-prone classes in practice.

17.
A Clustered-Term-Based Algorithm for Large-Scale Detection of Reposted Text   Cited by: 1 (self-citations: 1; other citations: 0)
Reposted-text detection identifies sets of documents with identical or near-identical content in a large-scale text corpus; it is widely needed in applications such as hot-topic detection, search-result deduplication, and academic plagiarism detection. To cope with the growing diversity of online text reposting and to further improve the efficiency of practical systems, this paper analyzes various text features and comparison algorithms and proposes a clustered-term-based algorithm for large-scale reposted-text detection: based on the distributional properties of words, high-scoring clustered terms are identified and extracted to represent each text; the document collection is then processed with two passes, an extended linear comparison and a multidimensional comparison, to produce the final detection results. Comparative experiments show that the algorithm achieves strong overall performance in precision, recall, and efficiency.

18.
Wang Jun, Zhao Zhengyun, Yang Shangqin, Chai Xiuli, Zhang Wanjun, Zhang Miaohui. Applied Intelligence, 2022, 52(6): 6208-6226

High-level semantic features and low-level detail features both matter for salient object detection in fully convolutional neural networks (FCNs), and further integration of low-level and high-level features increases the ability to map salient object features. In addition, different channels in the same feature are not of equal importance to saliency detection. In this paper, we propose a residual attention learning strategy and a multistage refinement mechanism to gradually refine the coarse prediction in a scale-by-scale manner. First, a global information complementary (GIC) module is designed by integrating low-level detailed features and high-level semantic features. Second, to extract multiscale features of the same layer, a multiscale parallel convolutional (MPC) module is employed. Afterwards, we present a residual attention mechanism module (RAM) to receive the feature maps of adjacent stages from the hybrid feature cascaded aggregation (HFCA) module. The HFCA aims to enhance the feature maps, reducing the loss of spatial detail and the impact of variations in the shape, scale, and position of the object. Finally, we adopt a multiscale cross-entropy loss to guide the network in learning salient features. Experimental results on six benchmark datasets demonstrate that the proposed method significantly outperforms 15 state-of-the-art methods under various evaluation metrics.


19.
Predicting fault-prone software modules in telephone switches   Cited by: 1 (self-citations: 0; other citations: 1)
An empirical study was carried out at Ericsson Telecom AB to investigate the relationship between several design metrics and the number of function test failure reports associated with software modules. A tool, ERIMET, was developed to analyze the design documents automatically. Preliminary results from the study of 130 modules showed that, based on fault and design data, one can satisfactorily build a prediction model for identifying the most fault-prone modules before coding has started. The data analyzed show that 20 percent of the most fault-prone modules account for 60 percent of all faults. The prediction model built in this paper would have identified 20 percent of the modules accounting for 47 percent of all faults. At least four design measures can alternatively be used as predictors with equivalent performance. Size (in lines of code), used in a previous prediction model, was not significantly better than these four measures. The Alberg diagram introduced in this paper offers a way of assessing a predictor based on historical data, which is a valuable complement to linear regression when prediction data is ordinal. Applying the method described in this paper makes it possible to use design-phase measures to predict the most fault-prone modules.
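
The Alberg diagram can be computed in a few lines: order modules by the predictor, then accumulate the share of faults against the share of modules. The fault counts below are invented.

```python
import numpy as np

rng = np.random.default_rng(6)
metric = rng.pareto(2.0, 130)                     # predictor value per module
faults = np.maximum(0, (metric * 3 + rng.normal(0, 2, 130)).astype(int))

order = np.argsort(metric)[::-1]                  # most suspect modules first
cum_faults = np.cumsum(faults[order]) / max(faults.sum(), 1)
cum_modules = np.arange(1, 131) / 130

k = np.searchsorted(cum_modules, 0.20)            # index of the top-20% cutoff
print(f"top 20% of modules capture {cum_faults[k]:.0%} of faults")
# Plotting cum_faults against cum_modules for several candidate predictors
# (and for the perfect fault ordering) shows how much of the 20/60 potential
# each predictor realizes.
```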

20.
Context: In a large object-oriented software system, packages play the role of modules which group related classes together to provide well-identified services to the rest of the system. In this context, it is widely believed that modularization has a large influence on the quality of packages. Recently, Sarkar, Kak, and Rama proposed a set of new metrics to characterize the modularization quality of packages from important perspectives such as inter-module call traffic, state access violations, fragile base-class design, programming to interface, and plugin pollution. These package-modularization metrics are quite different from traditional package-level metrics, which measure software quality mainly from size, extensibility, responsibility, independence, abstractness, and instability perspectives. As such, it is expected that these package-modularization metrics should be useful predictors for fault-proneness. However, little is currently known about their actual usefulness for fault-proneness prediction, especially compared with traditional package-level metrics.

Objective: In this paper, we examine the role of these new package-modularization metrics for determining software fault-proneness in object-oriented systems.

Method: We first use principal component analysis to analyze whether these new package-modularization metrics capture additional information compared with traditional package-level metrics. Second, we employ univariate prediction models to investigate how these new package-modularization metrics are related to fault-proneness. Finally, we build multivariate prediction models to examine the ability of these new package-modularization metrics to predict fault-prone packages.

Results: Our results, based on six open-source object-oriented software systems, show that: (1) these new package-modularization metrics provide new and complementary views of software complexity compared with traditional package-level metrics; (2) most of these new package-modularization metrics have a significant association with fault-proneness in the expected direction; and (3) these new package-modularization metrics can substantially improve the effectiveness of fault-proneness prediction when used together with traditional package-level metrics.

Conclusions: The package-modularization metrics proposed by Sarkar, Kak, and Rama are useful for practitioners developing quality software systems.

