Similar Literature
1.
The increasing level of automation in critical infrastructures requires effective ways of finding faults in safety-critical software components. Synchronization in concurrent components is especially prone to errors and, because it is hard to explore all thread interleavings, synchronization faults are difficult to find. In this paper we present an experimental study demonstrating the effectiveness of model checking techniques in finding synchronization faults in safety-critical software when they are combined with a design for verification approach. We based our experiments on an automated air traffic control software component called the Tactical Separation Assisted Flight Environment (TSAFE). We first reengineered TSAFE using the concurrency controller design pattern. This pattern enables a modular verification strategy by decoupling the behaviors of the concurrency controllers from the behaviors of the threads that use them, via interfaces specified as finite state machines. The behavior of a concurrency controller is verified with respect to arbitrary numbers of threads using the infinite-state model checking techniques implemented in the Action Language Verifier (ALV). The threads that use the controller classes are checked for interface violations using the finite-state model checking techniques implemented in the Java Path Finder (JPF). We present techniques for thread isolation which enable us to analyze each thread in the program separately during interface verification. We conducted two sets of experiments using these verification techniques. First, we created 40 faulty versions of TSAFE using manual fault seeding. During this exercise we also developed a classification of faults that can be found using the presented design for verification approach. Next, we generated another 100 faulty versions of TSAFE using randomly seeded faults that were created automatically based on this fault classification. We used both infinite- and finite-state verification techniques for finding the seeded faults. The results of our experiments demonstrate the effectiveness of the presented design for verification approach in eliminating synchronization faults.
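As a rough illustration of the interface verification idea described above, the following Python sketch (not the paper's Java/ALV/JPF tooling) checks a thread's call trace against a controller interface specified as a finite state machine; the reader-writer states and actions are hypothetical:

```python
# Hypothetical reader-writer controller interface as a finite state machine.
# A thread's call trace conforms if every call is a legal transition.

class InterfaceFSM:
    def __init__(self):
        # (state, action) -> next state; any missing pair is a violation
        self.transitions = {
            ("idle", "acquire_read"): "reading",
            ("reading", "release_read"): "idle",
            ("idle", "acquire_write"): "writing",
            ("writing", "release_write"): "idle",
        }

    def check(self, trace):
        """Return None if the trace conforms, else the offending step."""
        state = "idle"
        for action in trace:
            if (state, action) not in self.transitions:
                return (state, action)          # interface violation
            state = self.transitions[(state, action)]
        return None if state == "idle" else (state, "end-of-trace")

fsm = InterfaceFSM()
print(fsm.check(["acquire_read", "release_read"]))   # None: conforms
print(fsm.check(["acquire_read", "acquire_write"]))  # ('reading', 'acquire_write')
```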

2.
Error flow analysis and testing techniques focus on the introduction of errors through code faults into data states of an executing program, and their subsequent cancellation or propagation to output. The goals and limitations of several error flow techniques are discussed, including mutation analysis, fault-based testing, PIE analysis, and dynamic impact analysis. The attributes desired of a good error flow technique are proposed, and a model called dynamic error flow analysis (DEFA) is described that embodies many of these attributes. A testing strategy is proposed that uses DEFA information to select an optimal set of test paths and to quantify the results of successful testing. An experiment is presented that illustrates this testing strategy. In this experiment, the proposed testing strategy outperforms mutation testing in catching arbitrary data state errors.
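The execution-infection-propagation view behind these techniques can be made concrete with a toy example. The sketch below (an invented two-statement program, not DEFA itself) seeds a fault, compares intermediate data states against the original, and shows an infection that is cancelled before reaching the output:

```python
# Toy error-flow demonstration: a seeded fault infects an intermediate
# data state, but abs() cancels the infection before it reaches the output.

def original(x):
    states = []
    y = x * 2
    states.append(y)        # intermediate data state
    z = abs(y)
    states.append(z)
    return z, states

def faulty(x):
    states = []
    y = x * -2              # seeded fault: wrong sign
    states.append(y)
    z = abs(y)              # cancels the sign error
    states.append(z)
    return z, states

for x in [3, -3]:
    out_o, st_o = original(x)
    out_f, st_f = faulty(x)
    infected = [i for i, (a, b) in enumerate(zip(st_o, st_f)) if a != b]
    print(f"x={x}: infected states {infected}, propagated: {out_o != out_f}")
# Both inputs infect state 0, yet the error never propagates to the output.
```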

3.
孙昌爱  吴思懿  张守峰  付安 《软件学报》2024,35(6):2844-2862
BPEL (Business Process Execution Language) is an executable web service composition language. Compared with traditional programs, BPEL programs differ considerably in programming model and execution style. These characteristics make it challenging to locate and fix faults discovered in BPEL programs during testing, and fault repair techniques for traditional software are difficult to apply to BPEL programs directly. Starting from mutation analysis, this paper proposes a template-matching-based fault repair method for BPEL programs, named BPELRepair. To overcome the high computational cost of mutation-analysis-based fault repair, multiple optimization strategies are proposed from three perspectives: patch generation, test case selection, and termination conditions. A BPEL fault repair support tool is developed to improve the automation and efficiency of fault repair. The proposed fault repair technique and optimization strategies are evaluated through an empirical study. Experimental results show that the proposed method can successfully repair about 53% of BPEL program faults, and that the proposed optimization strategies significantly reduce the cost of search and matching, patch validation, test case execution, and fault repair.
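The generate-and-validate loop that template-based repair methods such as BPELRepair build on can be sketched in a few lines; the toy "programs", templates, and tests below are stand-ins for BPEL processes, and the optimizations shown (checking previously failing tests first, bounding the number of candidate patches) only gesture at the paper's strategies:

```python
# A language-agnostic sketch of template-based generate-and-validate repair.

def repair(program, tests, templates, max_patches=100):
    failing = [t for t in tests if not t(program)]
    if not failing:
        return program                        # nothing to repair
    for n, template in enumerate(templates):
        if n >= max_patches:                  # termination condition
            break
        patch = template(program)
        # Optimization: run previously failing tests before the full suite.
        if all(t(patch) for t in failing) and all(t(patch) for t in tests):
            return patch                      # plausible fix found
    return None

# Toy example: the "program" is a predicate; a template flips a comparison.
buggy = lambda x: x > 0                       # intended behavior: x >= 0
tests = [lambda p: p(1), lambda p: p(0), lambda p: not p(-1)]
templates = [lambda p: (lambda x: x >= 0), lambda p: (lambda x: x < 0)]

fixed = repair(buggy, tests, templates)
print(fixed(0), fixed(1), fixed(-1))          # True True False: repaired
```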

4.
A Replicated Experiment to Assess Requirements Inspection Techniques
This paper presents the independent replication of a controlled experiment which compared three defect detection techniques (Ad Hoc, Checklist, and Defect-based Scenario) for software requirements inspections, and evaluated the benefits of collection meetings after individual reviews. The results of our replication were partially different from those of the original experiment. Unlike the original experiment, we did not find any empirical evidence of better performance when using scenarios. To explain these negative findings, we provide a list of hypotheses. On the other hand, the replication confirmed one result of the original experiment: the defect detection rate is not improved by collection meetings. The independent replication was made possible by the existence of an experimental kit provided by the original investigators. We discuss the difficulties we encountered in applying the package to our environment, as a result of different cultures and skills. Using our results, experience, and suggestions, other researchers will be able to improve the original experimental design before attempting further replications.

5.
Software fault prediction using different techniques has been studied by various researchers previously. It is observed that the performance of these techniques varies from dataset to dataset, which makes them inconsistent for fault prediction in unknown software projects. On the other hand, the use of ensemble methods for software fault prediction can be very effective, as an ensemble takes advantage of different techniques for the given dataset to produce better prediction results than any individual technique. Many works are available on binary-class software fault prediction (faulty or non-faulty prediction) using ensemble methods, but the use of ensemble methods for predicting the number of faults has not been explored so far. The objective of this work is to present a system using an ensemble of various learning techniques for predicting the number of faults in given software modules. We present a heterogeneous ensemble method for the prediction of the number of faults, and use a linear combination rule and a non-linear combination rule for the ensemble. The study is designed and conducted for different software fault datasets accumulated from publicly available data repositories. The results indicate that the presented system predicted the number of faults with higher accuracy, and the results are consistent across all the datasets. We also use prediction at level l (Pred(l)) and a measure of completeness to evaluate the results. Pred(l) gives the proportion of modules in a dataset for which the average relative error is less than or equal to a threshold value l. The results of the Pred(l) analysis and the measure-of-completeness analysis have also confirmed the effectiveness of the presented system for the prediction of the number of faults. Compared to single fault prediction techniques, the ensemble methods produced improved performance for the prediction of the number of software faults. The main impact of this work is to allow better utilization of testing resources by helping in the early and quick identification of most of the faults in the software system.
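A minimal sketch of the core idea, with synthetic data and an assumed weight vector for the linear combination rule (the paper's exact learners and combination weights are not reproduced here):

```python
# Heterogeneous ensemble for fault-count prediction, evaluated with Pred(l).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                  # module metrics (synthetic)
y = np.abs(X @ [2, 1, 0, 0.5, 0] + rng.normal(size=200))   # fault counts

X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]
base = [LinearRegression(), DecisionTreeRegressor(max_depth=4),
        KNeighborsRegressor(5)]
preds = np.column_stack([m.fit(X_tr, y_tr).predict(X_te) for m in base])

weights = np.array([0.5, 0.25, 0.25])          # assumed linear combination rule
ensemble = preds @ weights

def pred_at_l(actual, predicted, l=0.3):
    """Fraction of modules whose average relative error is <= l."""
    are = np.abs(predicted - actual) / (actual + 1)   # +1 avoids div by zero
    return np.mean(are <= l)

print(f"Pred(0.30) = {pred_at_l(y_te, ensemble):.2f}")
```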

6.
Context: Replication plays an important role in experimental disciplines. There are still many uncertainties about how to proceed with replications of SE experiments. Should replicators reuse the baseline experiment materials? How much liaison should there be among the original and replicating experimenters, if any? What elements of the experimental configuration can be changed for the experiment to be considered a replication rather than a new experiment? Objective: To improve our understanding of SE experiment replication, in this work we propose a classification which is intended to provide experimenters with guidance about what types of replication they can perform. Method: The research approach is structured according to the following activities: (1) a literature review of experiment replication in SE and in other disciplines, (2) identification of typical elements that compose an experimental configuration, (3) identification of different replication purposes, and (4) development of a classification of experiment replications for SE. Results: We propose a classification of replications which provides experimenters in SE with guidance about what changes they can make in a replication and, based on these, what verification purposes such a replication can serve. The proposed classification helped to accommodate opposing views within a broader framework, and it accounts for replications ranging from less similar to more similar with respect to the baseline experiment. Conclusion: The aim of replication is to verify results, but different types of replication serve particular verification purposes and afford different degrees of change. Each replication type helps to discover particular experimental conditions that might influence the results. The proposed classification can be used to identify changes in a replication and, based on these, understand the level of verification.

7.
Fault-based testing attempts to show that particular faults cannot exist in software by using test sets that differentiate between the original program (hypothesized to be correct) and faulty alternate programs. The success of this approach depends on a number of assumptions, notably that programmers are competent insofar as they only commit relatively trivial faults, and that faults only couple infrequently. Fault coupling occurs when test sets are able to differentiate between the original program and faulty alternate programs when faults occur in isolation, but not when they occur in combination; it is a complicating factor in fault-based testing. Fault coupling is studied here within the context of finite bijective functions. A complete mathematical solution of the problem is possible in this simplified case; the results indicate that fault coupling does indeed occur infrequently, and are thus in agreement with the empirical results obtained by others in the field. One surprising result is that certain kinds of test set are able to avoid fault coupling altogether.
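The setting can be simulated exhaustively at small scale. In the sketch below, programs are modeled as permutations (finite bijections) of a three-element domain, a "combined fault" is modeled as composition of the two single-fault mutants (an assumption for illustration, not necessarily the paper's formalization), and a brute-force search reports every test set that kills both single mutants but misses their combination:

```python
# Brute-force search for fault coupling among bijections of a small domain.
from itertools import permutations, combinations

domain = (0, 1, 2)
identity = (0, 1, 2)                 # the "correct" program
mutants = [p for p in permutations(domain) if p != identity]

def kills(tests, mutant):
    """A test set kills a mutant if some input distinguishes it."""
    return any(identity[t] != mutant[t] for t in tests)

test_sets = [c for n in (1, 2) for c in combinations(domain, n)]
for m1, m2 in combinations(mutants, 2):
    combined = tuple(m2[m1[i]] for i in domain)   # both faults together
    if combined == identity:
        continue                                   # faults cancel outright
    for tests in test_sets:
        if kills(tests, m1) and kills(tests, m2) and not kills(tests, combined):
            print(f"coupling: {tests} kills {m1} and {m2} but not {combined}")
```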

8.
Speaker verification (SV) using the i-vector concept has become state-of-the-art. In this technique, speakers are projected onto the total variability space and represented by vectors called i-vectors. During testing, the i-vectors of the test speech segment and the claimant are conditioned to compensate for session variability before scoring. The i-vector system can therefore be viewed as two processing blocks: one is the total variability space and the other is the post-processing module. Several questions arise, such as: (i) which part of the i-vector system plays the major role in speaker verification, the total variability space or the post-processing task; and (ii) is the post-processing module intrinsic to the total variability space? The motivation of this paper is to partially answer these questions by proposing several simpler speaker characterization systems for speaker verification, where speakers are represented by their speaker characterization vectors (SCVs). The SCVs are obtained by uniform segmentation of the speakers' Gaussian mixture model (GMM) and maximum likelihood linear regression (MLLR) super-vectors. We consider two adaptation approaches for the GMM super-vector: one is maximum a posteriori and the other is MLLR. Similarly to the i-vector, SCVs are post-processed for session variability compensation during testing. The proposed system shows promising performance when compared to the classical i-vector system, which indicates that the post-processing task plays a major role in i-vector based SV systems and is not intrinsic to the total variability space. All experimental results are reported on the NIST 2008 SRE core condition.
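One plausible reading of the SCV construction, sketched with numpy: stack the adapted GMM component means into a super-vector, cut it into uniform segments, and score with cosine similarity. The segment-averaging step, the scoring rule, and all data are assumptions for illustration, not the paper's exact recipe:

```python
import numpy as np

def scv(supervector, n_segments):
    """Uniformly segment the super-vector; average each segment."""
    segs = np.array_split(supervector, n_segments)
    return np.array([s.mean() for s in segs])

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(1)
# Hypothetical adapted GMM means (8 components, 20-dim) per speaker model.
target   = np.concatenate([rng.normal(m, 0.1, 20) for m in range(8)])
claimant = target + rng.normal(0, 0.05, target.size)       # same speaker
impostor = np.concatenate([rng.normal(7 - m, 0.1, 20) for m in range(8)])

v_t, v_c, v_i = (scv(v, 16) for v in (target, claimant, impostor))
print(f"claimant score: {cosine(v_c, v_t):.3f}")   # high: same speaker
print(f"impostor score: {cosine(v_i, v_t):.3f}")   # noticeably lower
```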

9.
Parallel programs are intrinsically non-deterministic, and therefore the techniques of cyclical debugging that are commonly used for sequential programs are not suitable for parallel ones. This paper proposes a method to reproduce Occam program behaviour. Saving information on the timer values input by the program and the guards selected at run-time on alternative commands allows program replay, i.e. it makes it possible to re-execute the program deterministically with the same inputs following the same instruction path. This enables the software developer to use tools such as debuggers and intrusive monitors to help identify program faults. After discussing possible implementations of the proposed technique, IRD (an interactive replay debugger for Occam programs) is described. Finally, the use of the IRD in a sample debug session is presented as an example.
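The record/replay principle is easy to state in code. The Python sketch below (a stand-in for Occam timers and ALT guard selection, not IRD itself) logs every nondeterministic event on the first run and feeds the log back verbatim on replay, so re-execution follows the same path:

```python
import random, time

class Recorder:
    """Log nondeterministic events on first run; replay them verbatim."""
    def __init__(self, log=None):
        self.replaying = log is not None
        self.log = log if self.replaying else []

    def timer(self):
        if self.replaying:
            return self.log.pop(0)
        value = time.monotonic_ns()          # timer input, logged
        self.log.append(value)
        return value

    def select_guard(self, ready_guards):
        if self.replaying:
            return self.log.pop(0)
        choice = random.choice(ready_guards)  # nondeterministic ALT selection
        self.log.append(choice)
        return choice

def run(rec):
    t = rec.timer()
    g = rec.select_guard(["chan_a", "chan_b"])
    return f"t={t} guard={g}"

rec = Recorder()
first = run(rec)                              # original nondeterministic run
replayed = run(Recorder(list(rec.log)))       # deterministic replay
assert first == replayed
print(first)
```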

10.
Two experimental comparisons of data flow and mutation testing are presented. These techniques are widely considered to be effective for unit-level software testing, but can only be analytically compared to a limited extent. We compare the techniques by evaluating the effectiveness of test data developed for each. We develop ten independent sets of test data for a number of programs: five to satisfy the mutation criterion and five to satisfy the all-uses data-flow criterion. These test sets are developed using automated tools, in a manner consistent with the way a test engineer might be expected to generate test data in practice. We use these test sets in two separate experiments. First we measure the effectiveness of the test data that was developed for one technique in terms of the other. Second, we investigate the ability of the test sets to find faults. We place a number of faults into each of our subject programs, and measure the number of faults that are detected by the test sets. Our results indicate that while both techniques are effective, mutation-adequate test sets are closer to satisfying the data flow criterion, and detect more faults.
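For reference, the adequacy measure behind the mutation criterion used in this comparison is the mutation score, computable as below; the program, mutants, and tests are toy stand-ins, and note that in practice known-equivalent mutants must be identified and excluded:

```python
# Mutation score: fraction of mutants killed by a test set.
def program(a, b):
    return max(a, b)

mutants = [
    lambda a, b: min(a, b),   # operator replacement
    lambda a, b: max(a, a),   # operand replacement
    lambda a, b: max(a, b),   # an equivalent mutant: survives every test
]
test_set = [(1, 2), (5, 3)]

killed = sum(any(program(a, b) != m(a, b) for a, b in test_set)
             for m in mutants)
print(f"mutation score: {killed}/{len(mutants)} = {killed / len(mutants):.2f}")
# Known-equivalent mutants would be removed from the denominator in practice.
```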

11.
Software inspections have been introduced in software engineering in order to detect faults before testing is performed. Reading techniques provide reviewers in software inspections with guidelines on how they should check the documents under inspection. Several reading techniques with different purposes have been introduced and empirically evaluated. In this paper, we describe a reading technique with the special aim to detect faults that are severe from a user’s point of view. The reading technique is named usage-based reading (UBR) and it can be used to inspect all software artefacts. In the series of experiments, a high-level design document is used. The main focus of the paper is on the third experiment, which investigates the information needed for UBR in the individual preparation and the meeting of software inspections. Hence, the paper discusses (1) the series of three experiments of UBR, (2) the individual preparation of the third experiment, and (3) the meeting part of the third experiment. For each of these three parts, results are produced. The main results are (1) UBR is an efficient and effective reading technique that can be used for user-focused software inspections, (2) UBR is more efficient and effective if the information used for UBR is developed prior to, instead of during, the individual preparation, and (3) the meeting affects the UBR inspection in terms of increased effectiveness and decreased efficiency. In summary, the empirical evidence shows that UBR is an efficient and effective reading technique to be used by software organizations that produce software for which the user-perceived quality is important.

12.
Fault localization is an important and challenging task during software testing. Among the techniques studied in this field, program-spectrum-based fault localization is a promising approach. To perform spectrum-based fault localization, a set of test oracles must be provided, and the effectiveness of fault localization depends highly on the quality of those oracles. Moreover, effectiveness usually suffers when multiple simultaneous faults are present, and faced with multiple faults it is difficult for developers to determine when to stop the fault localization process. To address these issues, we propose an iterative fault localization process, i.e., an iterative process of selecting test cases for effective fault localization (IPSETFUL), to identify as many faults as possible in the program until the stopping criterion is satisfied. It is performed based on a concept lattice of program spectrum (CLPS) proposed in our previous work. Based on the labeling approach of the CLPS, program statements are categorized as dangerous statements, safe statements, and sensitive statements. To identify the faults, developers need to check the dangerous statements. Meanwhile, developers select from the original test suite a set of test cases covering the dangerous or sensitive statements, and a new CLPS is generated for the next iteration. Subsequent iterations proceed in the same way. The iterative process ends when there are no failing tests in the test suite and all statements on the CLPS have become safe statements. We conducted an empirical study on several subject programs, and the results show that IPSETFUL can help identify most of the faults in the program with the given test suite. Moreover, it can save much of the effort spent inspecting non-faulty program statements compared with the existing spectrum-based fault localization techniques and the relevant state-of-the-art technique.
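As background, a generic program-spectrum suspiciousness computation (the Ochiai formula over a toy coverage matrix, not the paper's concept-lattice machinery) looks like this:

```python
import math

# coverage[i][s] = 1 if test i executes statement s.
coverage = [
    [1, 1, 0, 1],   # test 0 (passes)
    [1, 0, 1, 0],   # test 1 (fails)
    [1, 1, 0, 1],   # test 2 (passes)
]
passed = [True, False, True]

total_failed = sum(not p for p in passed)
for s in range(len(coverage[0])):
    failed_cov = sum(row[s] and not p for row, p in zip(coverage, passed))
    total_cov = sum(row[s] for row in coverage)
    denom = math.sqrt(total_failed * total_cov)
    score = failed_cov / denom if denom else 0.0
    print(f"statement {s}: Ochiai suspiciousness = {score:.2f}")
# Statement 2, executed only by the failing test, ranks highest (1.00).
```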

13.
Mutation analysis is a software testing technique that requires the tester to generate test data that will find specific, well-defined errors. Mutation testing executes many slightly differing versions, called mutants, of the same program to evaluate the quality of the data used to test the program. Although these mutants are generated and executed efficiently by automated methods, many of the mutants are functionally equivalent to the original program and are not useful for testing. Recognizing and eliminating equivalent mutants is currently done by hand, a time-consuming and arduous task. This problem is currently a major obstacle to the practical application of mutation testing. This paper presents extensions to previous work in detecting equivalent mutants; specifically, algorithms for determining several classes of equivalent mutants are presented, an implementation of these algorithms is discussed, and results from using this implementation are presented. These algorithms are based on data flow analysis and six compiler optimization techniques. Each of these techniques is described together with how they are used to detect equivalent mutants. The design of the tool and some experimental results using it are also presented.
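A tiny sketch of one such compiler-optimization heuristic: constant-fold both the original and the mutant expression, and flag the mutant as equivalent when the optimized forms coincide. Real detectors combine several optimizations with data flow analysis; the expressions here are invented:

```python
import ast, operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

class ConstantFolder(ast.NodeTransformer):
    """Fold constant sub-expressions, one classic compiler optimization."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if (isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and type(node.op) in OPS):
            value = OPS[type(node.op)](node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value), node)
        return node

def optimized(expr):
    return ast.dump(ConstantFolder().visit(ast.parse(expr, mode="eval")))

original = "x + 2 * 3"
print(optimized(original) == optimized("x + 3 * 2"))  # True: equivalent mutant
print(optimized(original) == optimized("x - 2 * 3"))  # False: a useful mutant
```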

14.
Software health management (SWHM) techniques complement the rigorous verification and validation processes that are applied to safety-critical systems prior to their deployment. These techniques are used to monitor deployed software in its execution environment, serving as the last line of defense against the effects of a critical fault. SWHM monitors use information from the specification and implementation of the monitored software to detect violations, predict possible failures, and help the system recover from faults. Changes to the monitored software, such as adding new functionality or fixing defects, therefore, have the potential to impact the correctness of both the monitored software and the SWHM monitor. In this work, we describe how the results of a software change impact analysis technique, Directed Incremental Symbolic Execution (DiSE), can be applied to monitored software to identify the potential impact of the changes on the SWHM monitor software. The results of DiSE can then be used by other analysis techniques, e.g., testing, debugging, to help preserve and improve the integrity of the SWHM monitor as the monitored software evolves.

15.
Software testing techniques and criteria are considered complementary since they can reveal different kinds of faults and test distinct aspects of the program. Functional criteria, such as Category Partition, are difficult to automate and are usually applied manually. Structural and fault-based criteria generally provide measures to evaluate test sets. The existing supporting tools produce a lot of information, including input and produced output, structural coverage, mutation score, faults revealed, etc. However, such information is not linked to functional aspects of the software. In this work, we present an approach based on machine learning techniques to link test results from the application of different testing techniques. The approach groups test data into similar functional clusters. After this, according to the tester's goals, it generates classifiers (rules) that have different uses, including selection and prioritization of test cases. The paper also presents results from experimental evaluations and illustrates such uses.
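The clustering step might look like the following sketch, where each test execution is described by a feature vector assembled from testing-tool output; the features, data, and choice of k-means are illustrative assumptions, not the paper's configuration:

```python
import numpy as np
from sklearn.cluster import KMeans

# Each row: [statement coverage, mutation score, output equivalence class]
features = np.array([
    [0.91, 0.80, 0], [0.89, 0.78, 0],   # likely exercise the same function
    [0.45, 0.30, 1], [0.48, 0.35, 1],
    [0.70, 0.60, 2],
])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
for test_id, cluster in enumerate(labels):
    print(f"test {test_id} -> functional cluster {cluster}")
# One possible use: prioritize one representative test per cluster first.
```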

16.
Context: A replication is the repetition of an experiment. Several efforts have been made to adopt replication as a common practice in software engineering. There are different types of replications, depending on their purpose. Similar replications keep the experimental conditions as alike as possible to the original ones. External similar replications, where the replicating experimenters are not the same people as the original experimenters, have been a stumbling block: several attempts at combining the results of replications have resulted in failure. Software engineering does not appear to be well suited to such replications, because it works with complex, experimentally immature contexts. Software engineering settings have a large number of variables, and the role that many of them play is unknown. A successful (or useful) similar replication helps to better understand the phenomenon under study by verifying results and/or identifying contextual variables that could influence (or not) the results, through the combination of experimental results. Objective: To get successful similar replications, there needs to be interaction between original and replicating experimenters. In this paper, we propose an interaction process for achieving successful similar replications. Method: This process consists of an adaptation meeting, where experimenters tailor the experiment to the new setting; querying, to settle occasional inquiries while the experiment is being run; and a combination meeting, where experimenters meet to discuss the combination of replication outcomes with previous results. To check its effectiveness, the process has been tested on three different replications of the same experiment. Results: The proposed interaction process has helped to identify new contextual variables that could potentially influence (or not) the experimental results in the three replications run. Additionally, the interaction process has helped to uncover certain problems and deviations that occurred during some of the replications that we would not have been aware of otherwise. Conclusions: There are signs that suggest that it is possible to get successful similar replications in software engineering experimentation when there is appropriate interaction among experimenters.

17.
A Fault Localization Method After Regression Testing
Testing and debugging are intimately related activities, and regression testing is an important activity in software testing and maintenance. Locating faults in a program is a complex process that involves understanding the program's purpose, structure, and semantics, as well as the relevant characteristics of the fault-revealing tests. This paper proposes a fault localization method based on the chopping technique. The method repeatedly exploits debugging information and regression test results to extract from the program a set of statements that is related to particular statements and is much smaller than the original program, thereby achieving accurate and fast fault localization.
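The chop itself is the intersection of a forward slice from a suspect point and a backward slice from the failing output. A minimal sketch on a toy dependence graph (real chopping operates on program dependence graphs built by static analysis):

```python
deps = {               # statement -> statements it directly influences
    1: {2, 3},
    2: {4},
    3: {5},
    4: {6},
    5: {6},
}

def reachable(graph, start):
    """All statements reachable from start, including start itself."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen

reverse = {}
for src, dsts in deps.items():
    for dst in dsts:
        reverse.setdefault(dst, set()).add(src)

forward = reachable(deps, 2)        # statements affected by suspect stmt 2
backward = reachable(reverse, 6)    # statements influencing failing output 6
print(sorted(forward & backward))   # the chop: [2, 4, 6]
```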

18.
Regression testing is an expensive testing procedure utilized to validate modified software. Regression test selection techniques attempt to reduce the cost of regression testing by selecting a subset of a program's existing test suite. Safe regression test selection techniques select subsets that, under certain well-defined conditions, exclude no tests (from the original test suite) that if executed would reveal faults in the modified software. Many regression test selection techniques, including several safe techniques, have been proposed, but few have been subjected to empirical validation. This paper reports empirical studies on a particular safe regression test selection technique, in which the technique is compared to the alternative regression testing strategy of running all tests. The results indicate that safe regression test selection can be cost-effective, but that its costs and benefits vary widely based on a number of factors. In particular, test suite design can significantly affect the effectiveness of test selection, and coverage-based test suites may provide test selection results superior to those provided by test suites that are not coverage-based.
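The core of a coverage-based safe selection technique fits in a few lines: keep every test whose coverage intersects the modified entities, so no fault-revealing test for the change is excluded. The entities and coverage below are toy data, not the paper's subjects:

```python
coverage = {                      # test -> set of covered functions
    "t1": {"parse", "eval"},
    "t2": {"parse", "print"},
    "t3": {"print"},
}
changed = {"eval"}                # entities modified in the new version

# Safe under the usual conditions: any test that can reach the change runs.
selected = {t for t, covered in coverage.items() if covered & changed}
print(sorted(selected))           # ['t1'] — only t1 executes changed code
```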

19.
Fault seeding methods can be used not only to evaluate software performance and to study the characteristics of software faults, but also, by seeding faults, to provide the conditions necessary for evaluating software testing methods. Since the fault types targeted by white-box testing are program-code-level faults, to facilitate fault seeding this work divides program code faults into computational faults, domain faults, and program interface faults, and provides an improved mutation-based fault seeding method for these three categories.
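A sketch of what mutation-based seeding of the first two categories could look like using Python's ast module; the concrete operators (replacing + with -, < with <=) are illustrative stand-ins, and an interface fault could analogously be seeded by, e.g., swapping arguments at a call site:

```python
import ast

class Seeder(ast.NodeTransformer):
    """Seed one fault of the requested category (illustrative operators)."""
    def __init__(self, kind):
        self.kind = kind

    def visit_BinOp(self, node):              # computational fault: + -> -
        self.generic_visit(node)
        if self.kind == "computational" and isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

    def visit_Compare(self, node):            # domain fault: < -> <=
        self.generic_visit(node)
        if self.kind == "domain" and isinstance(node.ops[0], ast.Lt):
            node.ops[0] = ast.LtE()
        return node

src = "def clamp(x, lo):\n    return x + lo if x < 0 else x"
for kind in ("computational", "domain"):
    tree = Seeder(kind).visit(ast.parse(src))
    ast.fix_missing_locations(tree)
    print(f"{kind}: {ast.unparse(tree.body[0])}")
# Interface faults (e.g., swapped call arguments) are omitted for brevity.
```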

20.
To contribute to the body of empirical research on fault distributions during development of complex software systems, a replication of a study of Fenton and Ohlsson is conducted. The hypotheses from the original study are investigated using data taken from an environment that differs in terms of system size, project duration, and programming language. We have investigated four sets of hypotheses on data from three successive telecommunications projects: 1) the Pareto principle, that is, a small number of modules contain a majority of the faults (in the replication, the Pareto principle is confirmed), 2) fault persistence between test phases (a high fault incidence in function testing is shown to imply the same in system testing, as well as prerelease versus postrelease fault incidence), 3) the relation between number of faults and lines of code (the size relation from the original study could be neither confirmed nor disproved in the replication), and 4) fault density similarities across test phases and projects (in the replication study, fault densities are confirmed to be similar across projects). Through this replication study, we have contributed to what is known on fault distributions, which seem to be stable across environments.
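Checking the Pareto principle on per-module fault counts is simple arithmetic; the counts below are invented for illustration:

```python
faults = [40, 25, 10, 6, 4, 3, 2, 2, 1, 1, 0, 0]   # faults per module
faults.sort(reverse=True)

top = max(1, len(faults) // 5)                     # the top 20% of modules
share = sum(faults[:top]) / sum(faults)
print(f"top {top} of {len(faults)} modules hold {share:.0%} of all faults")
```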
