Similar Documents
20 similar documents found.
1.
Regression testing is an expensive testing procedure utilized to validate modified software. Regression test selection techniques attempt to reduce the cost of regression testing by selecting a subset of a program's existing test suite. Safe regression test selection techniques select subsets that, under certain well-defined conditions, exclude no tests (from the original test suite) that, if executed, would reveal faults in the modified software. Many regression test selection techniques, including several safe techniques, have been proposed, but few have been subjected to empirical validation. This paper reports empirical studies on a particular safe regression test selection technique, in which the technique is compared to the alternative regression testing strategy of running all tests. The results indicate that safe regression test selection can be cost-effective, but that its costs and benefits vary widely based on a number of factors. In particular, test suite design can significantly affect the effectiveness of test selection, and coverage-based test suites may provide test selection results superior to those provided by test suites that are not coverage-based.
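The core rule behind safe selection is simple enough to state in a few lines. Below is a minimal sketch (the data model and names are illustrative assumptions, not the studied technique's implementation): keep every test whose coverage intersects the changed code, so that no fault-revealing test can be excluded.

```python
# Illustrative sketch of safe regression test selection; `coverage` and the
# entity granularity are assumptions, not the studied technique's data model.

def select_tests(coverage, changed_entities):
    """coverage: test name -> set of code entities the test executes."""
    changed = set(changed_entities)
    # A safe technique keeps every test that could traverse modified code.
    return [t for t, entities in coverage.items() if entities & changed]

coverage = {
    "t1": {"f1", "f2"},
    "t2": {"f3"},
    "t3": {"f2", "f4"},
}
print(select_tests(coverage, ["f2"]))  # ['t1', 't3']
```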

2.
Software Engineering activities are information intensive. Research proposes Information Retrieval (IR) techniques to support engineers in their daily tasks, such as establishing and maintaining traceability links, fault identification, and software maintenance. We describe an engineering task, test case selection, and illustrate our problem analysis and solution discovery process. The objective of the study is to gain an understanding of the extent to which IR techniques (one potential solution) can be applied to test case selection and provide decision support in a large-scale, industrial setting. We analyze, in the context of the studied company, how test case selection is performed and design a series of experiments evaluating the performance of different IR techniques. Each experiment provides lessons learned from implementation, execution, and results, feeding into its successor. The three experiments led to the following observations: 1) there is a lack of research on scalable parameter optimization of IR techniques for software engineering problems; 2) scaling IR techniques to industry data is challenging, in particular for latent semantic analysis; 3) the IR context poses constraints on the empirical evaluation of IR techniques, requiring more research on developing valid statistical approaches. We believe that our experiences in conducting a series of IR experiments with industry-grade data are valuable for peer researchers so that they can avoid the pitfalls that we have encountered. Furthermore, we identified challenges that need to be addressed in order to bridge the gap between laboratory IR experiments and real applications of IR in the industry.
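As a rough illustration of the kind of decision support IR can provide for test case selection, here is a toy scikit-learn sketch that ranks test case descriptions against a change request via latent semantic analysis; the corpus, query, and dimensionality are illustrative assumptions, not the pipeline the study evaluates.

```python
# Toy IR-based test case ranking; with industry-grade data, scaling the LSA
# step (TruncatedSVD) is exactly where the abstract reports difficulty.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

test_cases = [
    "verify login with expired password",
    "check report export to pdf",
    "validate password reset email flow",
]
query = ["fix defect in password expiry handling"]

tfidf = TfidfVectorizer().fit_transform(test_cases + query)
lsa = TruncatedSVD(n_components=2).fit_transform(tfidf)  # latent semantic analysis

scores = cosine_similarity(lsa[-1:], lsa[:-1])[0]
for tc, s in sorted(zip(test_cases, scores), key=lambda p: -p[1]):
    print(f"{s:+.2f}  {tc}")
```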

3.
This paper presents a technique that leverages an existing regression test selection algorithm to compute accurate, updated coverage data on a version of the software, P_{i+1}, without rerunning any test cases that do not execute the changes from the previous version of the software, P_i, to P_{i+1}. The technique also reduces the cost of running those test cases that are selected by the regression test selection algorithm by performing a selective instrumentation that reduces the number of probes required to monitor the coverage data. Users of our technique can avoid the expense of rerunning the entire test suite on P_{i+1} or the inaccuracy produced by previous approaches that estimate coverage data for P_{i+1} or that reuse outdated coverage data from P_i. This paper also presents a tool, ReCover, that implements our technique, along with a set of empirical studies on a set of subjects that includes several industrial programs, versions, and test cases. The studies show the inaccuracies that can exist when an application (here, regression test selection) uses estimated or outdated coverage data. The studies also show that the overhead incurred by the selective instrumentation used in our technique is negligible, and that overall our technique provides savings over earlier techniques.
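The bookkeeping at the heart of the technique can be sketched as follows (a hedged illustration; the names and data model are ours, not ReCover's): coverage for tests that cannot reach the changes is carried forward from P_i unchanged, and only the selected tests are rerun, with selective instrumentation, to refresh theirs.

```python
# Hedged sketch of updating coverage from version P_i to P_{i+1} without
# rerunning unaffected tests; `rerun` stands in for instrumented re-execution.

def updated_coverage(old_cov, selected, rerun):
    """old_cov: test -> entities covered on P_i; selected: tests that execute
    changed code; rerun(test) -> fresh coverage measured on P_{i+1}."""
    return {
        # Unselected tests cannot reach the changes, so their old data stays accurate.
        test: rerun(test) if test in selected else entities
        for test, entities in old_cov.items()
    }

old = {"t1": {"f1"}, "t2": {"f2", "f3"}}
print(updated_coverage(old, {"t2"}, lambda t: {"f2", "f3", "f5"}))
```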

4.
Regression testing is an important activity in the software life cycle, but it can also be very expensive. To reduce the cost of regression testing, software testers may prioritize their test cases so that those which are more important, by some measure, are run earlier in the regression testing process. One potential goal of test case prioritization techniques is to increase a test suite's rate of fault detection (how quickly, in a run of its test cases, that test suite can detect faults). Previous work has shown that prioritization can improve a test suite's rate of fault detection, but the assessment of prioritization techniques has been limited primarily to hand-seeded faults, largely due to the belief that such faults are more realistic than automatically generated (mutation) faults. A recent empirical study, however, suggests that mutation faults can be representative of real faults and that the use of hand-seeded faults can be problematic for the validity of empirical results focusing on fault detection. We have therefore designed and performed two controlled experiments assessing the ability of test case prioritization techniques to improve the rate of fault detection, measured relative to mutation faults. Our results show that prioritization can be effective relative to the faults considered, and they expose ways in which that effectiveness can vary with characteristics of faults and test suites. More importantly, a comparison of our results with those collected using hand-seeded faults reveals several implications for researchers performing empirical studies of test case prioritization techniques in particular and testing techniques in general.
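The "rate of fault detection" in this line of work is conventionally measured with APFD (average percentage of faults detected); assuming the standard definition applies here, for a suite of n test cases detecting m faults, with TF_i the position in the prioritized order of the first test that reveals fault i:

```latex
\mathrm{APFD} \;=\; 1 \;-\; \frac{TF_1 + TF_2 + \cdots + TF_m}{n \, m} \;+\; \frac{1}{2n}
```

Higher APFD means faults are revealed earlier in the run of the prioritized suite.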

5.
Recently, several test case prioritization (TCP) techniques have been proposed to order test cases for achieving a goal during test execution, in particular, revealing faults sooner. In the model-based testing (MBT) context, such techniques are usually based on heuristics related to structural elements of the model and derived test cases. In this sense, techniques' performance may vary due to a number of factors. While empirical studies comparing the performance of TCP techniques have already been presented in the literature, there is still little knowledge, particularly in the MBT context, about which factors may influence the outcomes suggested by a TCP technique. In a previous family of empirical studies focusing on labeled transition systems, we identified that the model layout, i.e., the number of branches, joins, and loops in the model, alone may have little influence on the effectiveness of the TCP techniques investigated, whereas the characteristics of test cases that actually fail definitely influence this aspect. However, we considered only synthetic artifacts in the study, which reduced its ability to represent reality properly. In this paper, we present a replication of one of these studies, now with a larger and more representative selection of techniques, and considering test suites from industrial systems as experimental objects. Our objective is to find out whether the results hold while increasing validity in comparison to the original study. Results reinforce that there is no best performer among the investigated techniques and that the characteristics of failing test cases are an important factor, although adaptive random-based techniques are less affected by it.
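As a concrete illustration of the adaptive random family highlighted in the conclusion, here is a generic ART-style prioritizer over coverage sets (a sketch under our own assumptions, not one of the specific techniques compared in the study): each step samples a few candidates and picks the one farthest, by nearest-neighbor Jaccard distance, from the tests already ordered.

```python
# Generic adaptive-random test prioritization sketch; the distance measure,
# candidate-set size, and data model are illustrative assumptions.
import random

def jaccard_distance(a, b):
    return 1 - len(a & b) / len(a | b) if a | b else 0.0

def adaptive_random_prioritize(coverage, k=3, seed=0):
    """coverage: test -> set of covered model elements."""
    rng = random.Random(seed)
    remaining = set(coverage)
    order = [rng.choice(sorted(remaining))]
    remaining.remove(order[0])
    while remaining:
        candidates = rng.sample(sorted(remaining), min(k, len(remaining)))
        # Farthest-from-ordered candidate: maximize the minimum distance.
        best = max(candidates, key=lambda c: min(
            jaccard_distance(coverage[c], coverage[o]) for o in order))
        order.append(best)
        remaining.remove(best)
    return order

cov = {"t1": {"a", "b"}, "t2": {"b"}, "t3": {"c", "d"}, "t4": {"a"}}
print(adaptive_random_prioritize(cov))
```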

6.
Elmidaoui, Sara; Cheikhi, Laila; Idri, Ali; Abran, Alain. Journal of Computer Science and Technology, 2020, 35(5): 1147-1174.

Maintaining software once implemented on the end-user side is laborious and, over its lifetime, is most often considerably more expensive than the initial software development. The prediction of software maintainability has emerged as an important research topic to address industry expectations for reducing costs, in particular, maintenance costs. Researchers and practitioners have been working on proposing and identifying a variety of techniques ranging from statistical to machine learning (ML) for better prediction of software maintainability. This review has been carried out to analyze the empirical evidence on the accuracy of software product maintainability prediction (SPMP) using ML techniques. This paper analyzes and discusses the findings of 77 selected studies published from 2000 to 2018 according to the following criteria: maintainability prediction techniques, validation methods, accuracy criteria, overall accuracy of ML techniques, and the techniques offering the best performance. The review followed the well-known systematic review process. The results show that ML techniques are frequently used in predicting maintainability. In particular, artificial neural network (ANN), support vector machine/regression (SVM/R), regression & decision trees (DT), and fuzzy & neuro-fuzzy (FNF) techniques are more accurate in terms of PRED and MMRE. The N-fold and leave-one-out cross-validation methods, and the MMRE and PRED accuracy criteria, are frequently used in empirical studies. In general, ML techniques outperformed non-machine-learning techniques, e.g., regression analysis (RA), while FNF outperformed SVM/R, DT, and ANN in most experiments. However, while many techniques were reported superior, no specific one can be identified as the best.
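The accuracy criteria named above are conventionally defined as follows, with y_i the actual and ŷ_i the predicted value for observation i, and l = 25 the usual PRED threshold:

```latex
\mathrm{MRE}_i = \frac{\lvert y_i - \hat{y}_i \rvert}{y_i}, \qquad
\mathrm{MMRE} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{MRE}_i, \qquad
\mathrm{PRED}(l) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\!\left[ \mathrm{MRE}_i \le \tfrac{l}{100} \right]
```

Lower MMRE and higher PRED(25) indicate more accurate prediction models.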


7.
Mature knowledge allows engineering disciplines to achieve predictable results. Unfortunately, the type of knowledge used in software engineering can be considered to be of relatively low maturity, and developers are guided by intuition, fashion, or market-speak rather than by facts or undisputed statements proper to an engineering discipline. Testing techniques determine different criteria for selecting the test cases that will be used as input to the system under examination, which means that the effective and efficient selection of test cases determines the success of the tests. The knowledge for selecting testing techniques should come from studies that empirically justify the benefits and application conditions of the different techniques. This paper analyzes the maturity level of the knowledge about testing techniques by examining existing empirical studies of these techniques. We analyzed their results and obtained a classification of testing technique knowledge based on its factuality and objectivity, according to four parameters.

8.
Among numerous possible choices of effort estimation methods, analogy-based software effort estimation based on case-based reasoning is one of the most adopted methods in both industry and research communities. Solution adaptation is the final step of analogy-based estimation, employed to aggregate and adapt the solutions derived during the case-based reasoning process. Variants of solution adaptation techniques have been proposed in previous studies; however, the ranking of these techniques is not conclusive and shows conflicting results, since different studies rank these techniques in different ways. This paper aims to find a stable ranking of solution adaptation techniques for analogy-based estimation. Compared with existing studies, we evaluate 8 commonly adopted solution adaptation techniques with more datasets (12), more feature selection techniques (4), and more stable error measures (5), together with a robust statistical test method based on the Brunner test. This comprehensive experimental procedure allows us to discover a stable ranking of the techniques applied, and to observe similar behaviors from techniques with similar adaptation mechanisms. In general, the linear adaptation techniques based on the functions of size and productivity (e.g., the regression-towards-the-mean technique) outperform the other techniques under the more robust experimental setting adopted in this study. Our empirical results show that project features with a strong correlation to effort, such as software size or productivity, should be utilized in the solution adaptation step to achieve desirable performance. Designing a solution adaptation strategy in analogy-based software effort estimation requires careful consideration of those influential features to ensure that its predictions are relevant and accurate.
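A minimal sketch of a linear, size/productivity-based adaptation step in the spirit of the regression-towards-the-mean technique the abstract singles out (the regression factor m and the exact formula are our illustrative assumptions, not the paper's formulation):

```python
# Illustrative solution adaptation for analogy-based effort estimation:
# pull each analogue's productivity toward the mean, then rescale by size.

def adapt_estimate(analogues, target_size, m=0.8):
    """analogues: (size, effort) pairs of retrieved similar projects;
    m: regression factor (an assumed constant) damping extreme productivities."""
    prods = [effort / size for size, effort in analogues]
    mean_prod = sum(prods) / len(prods)
    adjusted = [mean_prod + m * (p - mean_prod) for p in prods]
    return target_size * sum(adjusted) / len(adjusted)

print(adapt_estimate([(10, 50), (12, 72), (8, 44)], target_size=11))
```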

9.
10.
Regression testing is an important activity that can account for a large proportion of the cost of software maintenance. One approach to reducing the cost of regression testing is to employ a selective regression testing technique that chooses a subset of the test suite that was used to test the software before the modifications, and then uses this subset to test the modified software. Selective regression testing techniques reduce the cost of regression testing if the cost of selecting the subset from the test suite, together with the cost of running the selected subset of test cases, is less than the cost of rerunning the entire test suite. Rosenblum and Weyuker (1997) proposed coverage-based predictors for use in predicting the effectiveness of regression test selection strategies. Using the regression testing cost model of Leung and White (1989; 1990), Rosenblum and Weyuker demonstrated the applicability of these predictors by performing a case study involving 31 versions of the KornShell. To further investigate the applicability of the Rosenblum-Weyuker (RW) predictor, additional empirical studies have been performed. The RW predictor was applied to a number of subjects, using two different selective regression testing tools, DejaVu and TestTube. These studies support two conclusions. First, they show that there is some variability in the success with which the predictors work; second, they suggest that these results can be improved by incorporating information about the distribution of modifications. It is shown how the RW prediction model can be improved to provide such an accounting.
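Under the RW model's simplifying assumption that each covered entity is equally likely to be modified, the predicted fraction of the suite a safe technique selects reduces to an average, over entities, of how many tests cover each one. A hedged sketch of that computation (the data model is illustrative, and this is our reading of the predictor, not a transcription of it):

```python
# Sketch of a Rosenblum-Weyuker-style coverage-based predictor under the
# uniform single-change assumption; the data model is illustrative.

def rw_predicted_fraction(coverage):
    """coverage: test -> set of covered entities."""
    tests = list(coverage)
    entities = set().union(*coverage.values())
    # For each entity, the fraction of tests selected if that entity changed.
    fractions = [sum(1 for t in tests if e in coverage[t]) / len(tests)
                 for e in entities]
    return sum(fractions) / len(fractions)

cov = {"t1": {"f1", "f2"}, "t2": {"f2"}, "t3": {"f2", "f3"}}
print(rw_predicted_fraction(cov))  # 5/9, roughly 0.56
```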

11.
Increasingly, modern-day software systems are being built by combining externally-developed software components with application-specific code. For such systems, existing program-analysis-based software engineering techniques may not directly apply, due to lack of information about the components. To address this problem, the use of component metadata has been proposed. Component metadata are metadata and metamethods, provided with components, that retrieve or calculate information about those components. In particular, two component-metadata-based approaches for regression test selection are described: one using code-based component metadata and the other using specification-based component metadata. The results of empirical studies that illustrate the potential of these techniques to provide savings in re-testing effort are provided.

12.
Analyzing regression test selection techniques
Regression testing is a necessary but expensive maintenance activity aimed at showing that code has not been adversely affected by changes. Regression test selection techniques reuse tests from an existing test suite to test a modified program. Many regression test selection techniques have been proposed; however, it is difficult to compare and evaluate these techniques because they have different goals. This paper outlines the issues relevant to regression test selection techniques and uses these issues as the basis for a framework within which to evaluate the techniques. The paper illustrates the application of the framework by using it to evaluate existing regression test selection techniques. The evaluation reveals the strengths and weaknesses of existing techniques and highlights some problems that future work in this area should address.

13.
The purpose of regression testing is to ensure that software modifications have not introduced new faults. As software evolves, however, the regression test suite grows continuously, and regression test case selection techniques have emerged to control the cost. In recent years, cluster analysis has been applied to the regression test case selection problem. This paper introduces semi-supervised learning into clustering and proposes the Discriminative Semi-supervised K-means clustering Method (DSKM). DSKM mines hidden pairwise constraints from the historical execution records of regression testing and learns from a large number of unlabeled samples together with a small number of labeled samples, improving the clustering results and, in turn, the test case selection results. Experiments show that, compared with the Constrained-Kmeans and SSKM methods, DSKM achieves a better reduction rate while maintaining coverage.
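DSKM itself is not spelled out in the abstract; as a generic illustration of how mined pairwise constraints can steer K-means clustering of test cases, here is a minimal COP-KMEANS-style sketch (a different, simpler constrained-clustering algorithm, with all names and data illustrative):

```python
# COP-KMEANS-style constrained clustering sketch (not DSKM): must-link and
# cannot-link pairs, e.g. mined from execution history, constrain assignment.
import random

def violates(point, cluster, assign, must, cannot):
    for a, b in must:
        other = b if a == point else a if b == point else None
        if other is not None and assign.get(other, cluster) != cluster:
            return True
    for a, b in cannot:
        other = b if a == point else a if b == point else None
        if other is not None and assign.get(other) == cluster:
            return True
    return False

def constrained_kmeans(vectors, k, must=(), cannot=(), iters=10, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(list(vectors.values()), k)
    assign = {}
    for _ in range(iters):
        assign = {}
        for p, v in vectors.items():
            # Try clusters nearest-first, skipping constraint violations.
            ranked = sorted(range(k), key=lambda c: sum(
                (x - y) ** 2 for x, y in zip(v, centroids[c])))
            assign[p] = next((c for c in ranked
                              if not violates(p, c, assign, must, cannot)),
                             ranked[0])
        for c in range(k):  # recompute centroids from current members
            members = [vectors[p] for p, cl in assign.items() if cl == c]
            if members:
                centroids[c] = [sum(xs) / len(members) for xs in zip(*members)]
    return assign

vecs = {"t1": [1, 0], "t2": [1, 0.1], "t3": [0, 1], "t4": [0.1, 1]}
print(constrained_kmeans(vecs, 2, must=[("t1", "t2")], cannot=[("t1", "t3")]))
```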

14.
Research on test case minimization
This paper gives a formal description of the test suite minimization problem and proposes and implements two new algorithms for test case minimization. Unlike other existing minimization algorithms, these two algorithms consider not only each test case's coverage but also its execution cost, with the aim of improving minimization efficiency. Finally, experimental results from case studies of the two algorithms are presented. The results show that test suite minimization can effectively reduce the size of the regression test suite, substantially lower the cost of regression testing, and improve minimization efficiency.
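The key idea, weighing each test's coverage against its execution cost, can be illustrated with a cost-aware greedy reduction (a sketch of the general idea, not the paper's two algorithms): repeatedly keep the test with the highest uncovered-requirements-per-cost ratio.

```python
# Cost-aware greedy test suite minimization sketch; the data model is illustrative.

def minimize(coverage, cost):
    """coverage: test -> set of requirements it exercises; cost: test -> run cost."""
    uncovered = set().union(*coverage.values())
    kept = []
    while uncovered:
        # Best marginal coverage per unit of execution cost.
        best = max(coverage, key=lambda t: len(coverage[t] & uncovered) / cost[t])
        if not coverage[best] & uncovered:
            break  # nothing left that any test can still cover
        kept.append(best)
        uncovered -= coverage[best]
    return kept

cov = {"t1": {"r1", "r2"}, "t2": {"r2", "r3"}, "t3": {"r3"}}
print(minimize(cov, {"t1": 2.0, "t2": 1.0, "t3": 0.5}))  # ['t2', 't1']
```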

15.
This paper studies tools for checking the validity of a parametric regression model when both the response and the predictors are unobserved and distorted in a multiplicative fashion by an observed confounding variable. A residual-based empirical process test statistic, marked by proper functions of the regressors, is proposed. We derive the asymptotic distribution of the proposed empirical process test statistic: a centered Gaussian process under the null hypothesis and a non-centered one under local alternatives converging to the null hypothesis at parametric rates. We also suggest a bootstrap procedure to calculate critical values. Simulation studies are conducted to demonstrate the performance of the proposed test statistic, and real examples are analyzed for illustration.
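For orientation, a residual-marked empirical process in lack-of-fit testing typically has the following shape (an illustrative textbook form, not the paper's exact construction, which additionally calibrates the observations for the multiplicative distortion before forming residuals):

```latex
R_n(x) \;=\; \frac{1}{\sqrt{n}} \sum_{i=1}^{n}
  w(\tilde{X}_i)\, \hat{\varepsilon}_i\, \mathbf{1}\{\tilde{X}_i \le x\}
```

Here the calibrated regressors are denoted by the tilde, the residuals come from the fitted parametric model, and w is the marking function; Kolmogorov-Smirnov or Cramér-von Mises functionals of R_n then serve as test statistics.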

16.
Test suite augmentation techniques are used in regression testing to identify code elements in a modified program that are not adequately tested and to generate test cases to cover those elements. A defining feature of test suite augmentation techniques is the potential for reusing existing regression test suites. Our preliminary work suggests that several factors influence the efficiency and effectiveness of augmentation techniques that perform such reuse. These include the order in which target code elements are considered while generating test cases, the manner in which existing regression test cases and newly generated test cases are used, and the algorithm used to generate test cases. In this work, we present the results of two empirical studies examining these factors, considering two test case generation algorithms (concolic and genetic). The results of our studies show that the primary factor affecting augmentation using these approaches is the test case generation algorithm utilized; this affects both cost and effectiveness. The manner in which existing and newly generated test cases are utilized also has a substantial effect on efficiency and in some cases a substantial effect on effectiveness. The order in which target code elements are considered turns out to have relatively few effects when using concolic test case generation but in some cases influences the efficiency of genetic test case generation. The results of our first study, on four relatively small programs using a large number of test suites, are supported by our second study of a much larger program available in multiple versions. Together, the studies reveal a potential opportunity for creating a more cost-effective hybrid augmentation approach leveraging both concolic and genetic test case generation techniques, while appropriately utilizing our understanding of the factors that affect them.
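The three factors the studies examine (target ordering, test reuse, and the generation algorithm) can be made explicit in a small augmentation loop; the sketch below is our own framing with illustrative names, not the studied implementations.

```python
# Test suite augmentation sketch: `generate` is a pluggable concolic or
# genetic generator; the ordering and reuse policies are the studied factors.

def augment(targets, existing_tests, generate, order_key=None, reuse_new=True):
    """generate(target, seeds) -> a new test covering `target`, or None."""
    pool = list(existing_tests)          # existing regression tests seed generation
    new_tests = []
    ordered = sorted(targets, key=order_key) if order_key else list(targets)
    for target in ordered:
        test = generate(target, seeds=pool)
        if test is not None:
            new_tests.append(test)
            if reuse_new:                # feed fresh tests back in as seeds
                pool.append(test)
    return new_tests
```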

17.
Selecting a Cost-Effective Test Case Prioritization Technique
Regression testing is an expensive testing process used to validate modified software and detect whether new faults have been introduced into previously tested code. To reduce the cost of regression testing, software testers may prioritize their test cases so that those which are more important, by some measure, are run earlier in the regression testing process. One goal of prioritization is to increase a test suite's rate of fault detection. Previous empirical studies have shown that several prioritization techniques can significantly improve the rate of fault detection, but these studies have also shown that the effectiveness of these techniques varies considerably across various attributes of the programs, test suites, and modifications being considered. This variation makes it difficult for a practitioner to choose an appropriate prioritization technique for a given testing scenario. To address this problem, we analyze the fault detection rates that result from applying several different prioritization techniques to several programs and modified versions. The results of our analyses provide insights into which types of prioritization techniques are and are not appropriate under specific testing scenarios, and the conditions under which they are or are not appropriate. Our analysis approach can also be used by other researchers or practitioners to determine the prioritization techniques appropriate to other workloads.

18.
Context: The web has had a significant impact on all aspects of our society. As our society relies more and more on the web, the dependability of web applications has become increasingly important. To make these applications more dependable, for the past decade researchers have proposed various techniques for testing web-based software applications. Our literature search for related studies retrieved 193 papers in the area of web application testing, which appeared between 2000 and 2013. Objective: As this research area matures and the number of related papers increases, it is important to systematically identify, analyze, and classify the publications and provide an overview of the trends and empirical evidence in this specialized field. Methods: We systematically review the body of knowledge related to functional testing of web applications through a systematic literature review (SLR) study. This SLR is a follow-up and complementary study to a recent systematic mapping (SM) study that we conducted in this area. As part of this study, we pose three sets of research questions, define selection and exclusion criteria, and synthesize the empirical evidence in this area. Results: Our pool of studies includes a set of 95 papers (from the 193 retrieved papers) published in the area of web application testing between 2000 and 2013. The data extracted during our SLR study is available through a publicly-accessible online repository. Among our results are the following: (1) the list of test tools in this area and their capabilities, (2) the types of test models and fault models proposed in this domain, (3) the way the empirical studies in this area have been designed and reported, and (4) the state of empirical evidence and industrial relevance. Conclusion: We discuss the emerging trends in web application testing and the implications for researchers and practitioners in this area. The results of our SLR can help researchers obtain an overview of existing web application testing approaches, fault models, tools, metrics, and empirical evidence, and subsequently identify areas in the field that require more attention from the research community.

19.
A survey of test case prioritization techniques in regression testing
Chen Xiang, Chen Jihong, Ju Xiaolin, Gu Qing. Journal of Software, 2013, 24(8): 1695-1712.
Test case prioritization (TCP) is an active topic in regression testing research. By setting a specific ordering criterion, test cases are ordered to optimize their execution sequence, with the aim of maximizing a prioritization goal, such as the early fault detection rate of the test suite. TCP is particularly suited to testing scenarios where the testing budget is insufficient to execute all test cases. This paper first describes the TCP problem and classifies existing TCP techniques from three perspectives: source code, requirements, and models. It then summarizes existing research on a special class of TCP problems, namely test-resource-aware TCP. Next, it surveys the evaluation metrics commonly used in empirical studies, along with the influence of evaluation datasets and fault types on empirical conclusions. It then introduces applications of TCP techniques in specific testing domains, including combinatorial testing, event-driven application testing, Web service testing, and fault localization. Finally, directions for future work are discussed.

20.
Test case prioritization techniques, which are used to improve the cost-effectiveness of regression testing, order test cases in such a way that those cases that are expected to outperform others in detecting software faults are run earlier in the testing phase. The objective of this study is to examine what kinds of techniques have been widely used in papers on this subject, determine which aspects of test case prioritization have been studied, provide a basis for the improvement of test case prioritization research, and evaluate the current trends of this research area. We searched for papers in the following five electronic databases: IEEE Explorer, ACM Digital Library, Science Direct, Springer, and Wiley. Initially, the search string retrieved 202 studies, but upon further examination of titles and abstracts, 120 papers were identified as related to test case prioritization. There exists a large variety of prioritization techniques in the literature, with coverage-based prioritization techniques (i.e., prioritization in terms of the number of statements, basic blocks, or methods test cases cover) dominating the field. The proportion of papers on model-based techniques is on the rise, yet the growth rate is still slow. The proportion of papers that use datasets from industrial projects is found to be 64%, while the proportion that utilize public datasets for validation is only 38%. On the basis of this study, the following recommendations are provided for researchers: (1) give preference to public datasets rather than proprietary datasets; (2) develop more model-based prioritization methods; (3) conduct more studies on the comparison of prioritization methods; (4) always evaluate the effectiveness of the proposed technique with well-known evaluation metrics and compare the performance with existing methods; (5) publish surveys and systematic review papers on test case prioritization; and (6) use datasets from industrial projects that represent real industrial problems.
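The coverage-based techniques the survey reports as dominating the field are typically variants of the classic "total" and "additional" greedy strategies; a minimal sketch of both (the data model is illustrative):

```python
# Classic coverage-based prioritization sketches: "total" sorts by overall
# coverage; "additional" greedily maximizes newly covered elements.

def total_prioritize(coverage):
    return sorted(coverage, key=lambda t: -len(coverage[t]))

def additional_prioritize(coverage):
    remaining, covered, order = set(coverage), set(), []
    while remaining:
        best = max(sorted(remaining), key=lambda t: len(coverage[t] - covered))
        if not coverage[best] - covered and covered:
            covered = set()              # all elements seen: reset and continue
            continue
        order.append(best)
        remaining.remove(best)
        covered |= coverage[best]
    return order

cov = {"t1": {"s1", "s2", "s3"}, "t2": {"s3", "s4"}, "t3": {"s4"}}
print(total_prioritize(cov))       # ['t1', 't2', 't3']
print(additional_prioritize(cov))  # ['t1', 't2', 't3']
```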
