期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

An approach and tool for measurement of state variable based data-flow test coverage for aspect-oriented programs

《Information and Software Technology》2015

ContextData-flow testing approaches have been used for procedural and object-oriented programs, and shown to be effective in detecting faults. However, few such approaches have been evaluated for aspect-oriented programs. In such programs, data-flow interactions can occur between base classes and aspects, which can affect the behavior of both. Faults resulting from such interactions are hard to detect unless the interactions are specifically targeted during testing.ObjectiveThis paper presents an approach and tool implementation for measuring data-flow coverage based on state variables defined in base classes or aspects in AspectJ programs. The paper also reports on an empirical study that compares the cost and effectiveness of data-flow test criteria that are based on state variables with two control-flow criteria.MethodEffectiveness of the criteria was evaluated for various fault types. Cost-effectiveness of test suites that cover all state variable definition-use associations (DUAs) was evaluated for three coverage levels: 100%, 90%, and 80%.ResultsThe effort needed to obtain a test case that achieves data-flow coverage is higher than the effort needed to obtain a test case that covers a block or a branch in an advised class. Covering certain data flow associations requires more effort than for other types of data flow associations. The data-flow test criteria based on state variables of a base-class are in general more effective than control-flow criteria.ConclusionsOverall, it is cost-effective to obtain test suites at the 90% coverage level of data-flow criteria. 相似文献

2.

Comparing coverage criteria for dynamic web application: An empirical evaluation

《Computer Standards & Interfaces》2021

Web applications have become popular and a preferred mean for users to do various crucial tasks such as selling and buying goods, doing short tasks, controlling smart houses and bank account management. The correctness of all such applications is important and requires thorough testing. Structural testing is widely used to achieve correctness in traditional software's, however, for web applications, it is challenging because of its dynamic and heterogeneous nature. To achieve desired structural coverage of web applications different dynamic coverage criteria are used as a quality assessment indicator. However, there is a lack of empirical evidence regarding the effectiveness of the proposed coverage criteria. In this paper, we conduct an empirical evaluation by evaluating and comparing the fault detection effectiveness and efficiency of various dynamic coverage criteria by performing mutation analysis. We conduct a series of experiments to assess and compare four widely used coverage criteria on seven open-source case studies including small to large scale applications. We performed mutation analysis by first generating different faulty versions (mutants) for the case studies and then by executing test suites to record mutation score for each criterion. The results from most of the subject applications show that DOM coverage is the most effective and efficient criterion followed by Virtual DOM, HTML Element and Statement coverage criteria. 相似文献

3.

A unified framework for evaluating test criteria in model-checking-assisted test case generation

Bolong Zeng Li Tan 《Information Systems Frontiers》2014,16(5):823-834

Testing is often cited as one of the most costly operations in testing dependable systems (Heimdahl et al. 2001). A particular challenging task in testing is test-case generation. To improve the efficiency of test-case generation and reduce its cost, recently automated formal verification techniques such as model checking are extended to automate test-case generation processes. In model-checking-assisted test-case generation, a test criterion is formulated as temporal logical formulae, which are used by a model checker to generate test cases satisfying the test criterion. Traditional test criteria such as branch coverage criterion and newer temporal-logic-inspired criteria such as property coverage criteria (Tan et al. 2004) are used with model-checking-assisted test generation. Two key questions in model-checking-assisted test generation are how efficiently a model checker may generate test suites for these criteria and how effective these test suites are. To answer these questions, we developed a unified framework for evaluating (1) the effectiveness of the test criteria used with model-checking-assisted test-case generation and (2) the efficiency of test-case generation for these criteria. The benefits of this work are three-fold: first, the computational study carried out in this work provides some measurements of the effectiveness and efficiency of various test criteria used with model-checking-assisted test case generation. These performance measurements are important factors to consider when a practitioner selects appropriate test criteria for an application of model-checking-assisted test generation. Second, we propose a unified test generation framework based on generalized Büchi automata. The framework uses the same model checker, in this case, SPIN model checker (Holzmann 1997), to generate test cases for different criteria and compare them on a consistent basis. Last but not least, we describe in great details the methodology and automated test generation environment that we developed on the basis of our unified framework. Such details would be of interest to researchers and practitioners who want to use and extend this unified framework and its accompanying tools. 相似文献

4.

Comprehensive analysis of FBD test coverage criteria using mutants

Donghwan Shin Eunkyoung Jee Doo-Hwan Bae 《Software and Systems Modeling》2016,15(3):631-645

Function block diagram (FBD), a graphical modeling language for programmable logic controllers, has been widely used to implement safety critical system software such as nuclear reactor protection systems. With the growing importance of structural testing for FBD models, structural test coverage criteria for FBD models have been proposed and evaluated using mutation analysis in our previous work. We extend the previous work by comprehensively analyzing the relationships among fault detection effectiveness, test suite size, and coverage level through several research questions. We generate a large number of test suites achieving an FBD test coverage ranging from 0 to 100 %, and we also generate many artificial faults (i.e. mutants) for the FBD models. Our analysis results show that the fault detection effectiveness of the FBD coverage criteria increases with increasing coverage levels, and the coverage criteria are highly effective at detecting faults in all subject models. Furthermore, the test suites generated with the FBD coverage criteria are more effective and efficient than the randomly generated test suites. The FBD coverage criteria are strong at detecting faults in Boolean edges, while relatively weak at detecting wrong constants in FBD models. Empirical knowledge regarding our experiments provide the validity of using the FBD coverage criteria, and therefore, of FBD model-based testing. 相似文献

5.

Analysing the effectiveness of rule-coverage as a reduction criterion for test suites of grammar-based software 总被引：1，自引：0，他引：1

Mark Hennessy James F. Power 《Empirical Software Engineering》2008,13(4):343-368

The term grammar-based software describes software whose input can be specified by a context-free grammar. This grammar may occur explicitly in the software, in the form of an input specification to a parser generator, or implicitly, in the form of a hand-written parser. Grammar-based software includes not only programming language compilers, but also tools for program analysis, reverse engineering, software metrics and documentation generation. Hence, ensuring their completeness and correctness is a vital prerequisite for their use. In this paper we propose a strategy for the construction of test suites for grammar based software, and illustrate this strategy using the ISO C^+ + grammar. We use the concept of grammar-rule coverage as a pivot for the reduction of an implementation-based test suite, and demonstrate a significant decrease in the size of this suite. The effectiveness of this reduced test suite is compared to the original test suite with respect to code coverage and more importantly, fault detection. This work greatly expands upon previous work in this area and utilises large scale mutation testing to compare the effectiveness of grammar-rule coverage to that of statement coverage as a reduction criterion for test suites of grammar-based software. This work finds that when grammar rule coverage is used as the sole criterion for reducing test suites of grammar based software, the fault detection capability of that reduced test suite is greatly diminished when compared to other coverage criteria such as statement coverage.

James F. PowerEmail:

相似文献

6.

A detailed investigation of the effectiveness of <Emphasis Type="Italic">whole</Emphasis> test suite generation

José Miguel Rojas Mattia Vivanti Andrea Arcuri Gordon Fraser 《Empirical Software Engineering》2017,22(2):852-893

A common application of search-based software testing is to generate test cases for all goals defined by a coverage criterion (e.g., lines, branches, mutants). Rather than generating one test case at a time for each of these goals individually, whole test suite generation optimizes entire test suites towards satisfying all goals at the same time. There is evidence that the overall coverage achieved with this approach is superior to that of targeting individual coverage goals. Nevertheless, there remains some uncertainty on (a) whether the results generalize beyond branch coverage, (b) whether the whole test suite approach might be inferior to a more focused search for some particular coverage goals, and (c) whether generating whole test suites could be optimized by only targeting coverage goals not already covered. In this paper, we perform an in-depth analysis to study these questions. An empirical study on 100 Java classes using three different coverage criteria reveals that indeed there are some testing goals that are only covered by the traditional approach, although their number is only very small in comparison with those which are exclusively covered by the whole test suite approach. We find that keeping an archive of already covered goals along with the tests covering them and focusing the search on uncovered goals overcomes this small drawback on larger classes, leading to an improved overall effectiveness of whole test suite generation. 相似文献

7.

A density‐based greedy algorithm for higher strength covering arrays

Rene C. Bryce Charles J. Colbourn 《Software Testing, Verification and Reliability》2009,19(1):37-53

Algorithmic construction of software interaction test suites has focussed on pairwise coverage; less is known about the efficient construction of test suites for t‐way interactions with t≥3. This study extends an efficient density‐based algorithm for pairwise coverage to generate t‐way interaction test suites and shows that it guarantees a logarithmic upper bound on the size of the test suites as a function of the number of factors. To complement this theoretical guarantee, an implementation is outlined and some practical improvements are made. Computational comparisons with other published methods are reported. Many of the results improve upon those in the literature. However, limitations on the ability of one‐test‐at‐a‐time algorithms are also identified. Copyright © 2008 John Wiley & Sons, Ltd. 相似文献

8.

A Global Algorithm for Model-Based Test Suite Generation

Anders Hessel Paul Pettersson 《Electronic Notes in Theoretical Computer Science》2007,190(2):47

Model-based testing has been proposed as a technique to automatically verify that a system conforms to its specification. A popular approach is to use a model-checker to produce a set of test cases by formulating the test generation problem as a reachability problem. To guide the selection of test cases, a coverage criterion is often used. A coverage criterion can be seen as a set of items to be covered, called coverage items. We propose an on-the-fly algorithm that generates a test suite that covers all feasible coverage items. The algorithm returns a set of traces that includes a path fulfilling each item, without including redundant paths. The reachability algorithm explores a state only if it might increase the total coverage. The decision is global in the sense that it does not only regard each individual local search branch in isolation, but the total coverage in all branches together. For simpler coverage criteria as location of edge coverage, this implies that each model state is never explored twice.The algorithm presented in this paper has been implemented in the test generation tool Uppaal coer. We present encouraging results from applying the tool to a set of experiments and in an industrial sized case study. 相似文献

9.

UCov: a user‐defined coverage criterion for test case intent verification

Rawad Abou Assi Wes Masri Fadi Zaraket 《Software Testing, Verification and Reliability》2016,26(6):460-491

The goal of regression testing is to ensure that the behaviour of existing code, believed correct by previous testing, is not altered by new program changes. This paper argues that the primary focus of regression testing should be on code associated with (1) earlier bug fixes and (2) particular application scenarios considered to be important by the developer or tester. Existing coverage criteria do not enable such focus, for example, 100% branch coverage does not guarantee that a given bug fix is exercised or a given application scenario is tested. Therefore, there is a need for a new and complementary coverage criterion in which the user can definea test requirement characterizing a given behaviour to be covered as opposed to choosing from a pool of pre‐defined and generic program elements. This paper proposes this new methodology and calls it UCov, a user‐defined coverage criterion wherein a test requirement is an execution pattern of program elements, and possibly predicates, that a test case must satisfy. The proposed criterion is not meant to replace existing criteria, but to complement them as it focuses the testing on important code patterns that could go untested otherwise. UCov supports test case intent verification. For example, following a bug fix, the testing team may augment the regression suite with the test case that revealed the bug. However, this test case might become obsolete due to code modifications not related to the bug. But if a test requirement characterizing the bug was defined by the user, UCov would determine that test case intent verification failed. The UCov methodology was implemented for the Java platform, was successfully applied onto 10 real‐life case studies and was shown to have advantages over JUnit. The implementation comprises the following tools: (1) TRSpec: allows the user to easily specify complex test requirements; (2) TRCheck: checks whether user‐defined test requirements were satisfied, that is, supports test case intent verification; and (3) TRMigrate: migrates user‐defined test requirements to subsequent versions of a given program. Copyright © 2016 John Wiley & Sons, Ltd. 相似文献

10.

A novel strategy for automatic test data generation using soft computing technique

Priyanka CHAWLA Inderveer CHANA Ajay RANA 《Frontiers of Computer Science》2015,9(3):346

Software testing is one of the most crucial and analytical aspect to assure that developed software meets prescribed quality standards. Software development process invests at least 50% of the total cost in software testing process. Optimum and efficacious test data design of software is an important and challenging activity due to the nonlinear structure of software. Moreover, test case type and scope determines the quality of test data. To address this issue, software testing tools should employ intelligence based soft computing techniques like particle swarm optimization (PSO) and genetic algorithm (GA) to generate smart and efficient test data automatically. This paper presents a hybrid PSO and GA based heuristic for automatic generation of test suites. In this paper, we described the design and implementation of the proposed strategy and evaluated our model by performing experiments with ten container classes from the Java standard library. We analyzed our algorithm statistically with test adequacy criterion as branch coverage. The performance adequacy criterion is taken as percentage coverage per unit time and percentage of faults detected by the generated test data. We have compared our work with the heuristic based upon GA, PSO, existing hybrid strategies based on GA and PSO and memetic algorithm. The results showed that the test case generation is efficient in our work. 相似文献

11.

Defining a test coverage criterion for model-level testing of FBD programs

《Information and Software Technology》2013,55(11):2013-2027

ContextThe Programmable Logic Controller (PLC) is being integrated into the automation and control of computer systems in safety–critical domains at an increasing rate. Thoroughly testing such software to ensure safety is crucial. Function Block Diagram (FBD) is a popular data-flow programming language for PLC. Current practice often involves translating an FBD program into an equivalent C program for testing. Little research has been conducted on coverage of direct testing a data-flow program, such as an FBD program, at the model level. There are no commonly accepted structural test coverage criteria for data-flow programs. The objective of this study is to develop effective structural test coverage criterion for testing model-level FBD programs. The proposed testing scheme can be used to detect mutation errors at the logical function level.ObjectiveThe purpose of this study is to design a new test coverage criterion that can directly test FBD programs and effectively detect logical function mutation errors.MethodA complete test set for each function and function block in an FBD program are defined. Moreover, this method augments the data-flow path concept with a sensitivity check to avoid fault masking and effectively detect logical function mutation errors.ResultsPreliminary experiments show that this test coverage criterion is comprehensive and effective for error detection.ConclusionThe proposed coverage criterion is general and can be applied to real cases to improve the quality of data-flow program design. 相似文献

12.

Improving Fault Detection Capability by Selectively Retaining Test Cases during Test Suite Reduction

Jeffrey D. Gupta N. 《IEEE transactions on pattern analysis and machine intelligence》2007,33(2):108-123

Software testing is a critical part of software development. As new test cases are generated over time due to software modifications, test suite sizes may grow significantly. Because of time and resource constraints for testing, test suite minimization techniques are needed to remove those test cases from a suite that, due to code modifications over time, have become redundant with respect to the coverage of testing requirements for which they were generated. Prior work has shown that test suite minimization with respect to a given testing criterion can significantly diminish the fault detection effectiveness (FDE) of suites. We present a new approach for test suite reduction that attempts to use additional coverage information of test cases to selectively keep some additional test cases in the reduced suites that are redundant with respect to the testing criteria used for suite minimization, with the goal of improving the FDE retention of the reduced suites. We implemented our approach by modifying an existing heuristic for test suite minimization. Our experiments show that our approach can significantly improve the FDE of reduced test suites without severely affecting the extent of suite size reduction 相似文献

13.

Behaviour abstraction adequacy criteria for API call protocol testing

Hernan Czemerinski Victor Braberman Sebastian Uchitel 《Software Testing, Verification and Reliability》2016,26(3):211-244

Code artefacts that have non‐trivial requirements with respect to the ordering in which their methods or procedures ought to be called are common and appear, for instance, in the form of API implementations and objects. Testing such code artefacts to gain confidence that they conform to their intended protocols is an important and challenging problem. This paper proposes conformance testing adequacy criteria based on covering an abstraction of the intended behaviour's semantics. Thus, the criteria are independent of the specification language and structure used to describe the intended protocol and the language used to implement it. As a consequence, the results may be of use to black box conformance testing approaches in general. Experimental results show that the criteria are a good predictor for fault detection for protocol conformance and for classical structural coverage criteria such as statement and branch coverage. They also show that the division of the domain derived from the criterion produces subdomains such that most of its inputs are fault revealing. Copyright © 2015 John Wiley & Sons, Ltd. 相似文献

14.

On the use of a similarity function for test case selection in the context of model‐based testing

Emanuela G. Cartaxo Patrícia D. L. Machado Francisco G. Oliveira Neto 《Software Testing, Verification and Reliability》2011,21(2):75-100

Test case selection in model‐based testing is discussed focusing on the use of a similarity function. Automatically generated test suites usually have redundant test cases. The reason is that test generation algorithms are usually based on structural coverage criteria that are applied exhaustively. These criteria may not be helpful to detect redundant test cases as well as the suites are usually impractical due to the huge number of test cases that can be generated. Both problems are addressed by applying a similarity function. The idea is to keep in the suite the less similar test cases according to a goal that is defined in terms of the intended size of the test suite. The strategy presented is compared with random selection by considering transition‐based and fault‐based coverage. The results show that, in most of the cases, similarity‐based selection can be more effective than random selection when applied to automatically generated test suites. Copyright © 2009 John Wiley & Sons, Ltd. 相似文献

15.

Test-suite reduction and prioritization for modified condition/decision coverage

Jones J.A. Harrold M.J. 《IEEE transactions on pattern analysis and machine intelligence》2003,29(3):195-209

Software testing is particularly expensive for developers of high-assurance software, such as software that is produced for commercial airborne systems. One reason for this expense is the Federal Aviation Administration's requirement that test suites be modified condition/decision coverage (MC/DC) adequate. Despite its cost, there is evidence that MC/DC is an effective verification technique and can help to uncover safety faults. As the software is modified and new test cases are added to the test suite, the test suite grows and the cost of regression testing increases. To address the test-suite size problem, researchers have investigated the use of test-suite reduction algorithms, which identify a reduced test suite that provides the same coverage of the software according to some criterion as the original test suite, and test-suite prioritization algorithms, which identify an ordering of the test cases in the test suite according to some criteria or goals. Existing test-suite reduction and prioritization techniques, however, may not be effective in reducing or prioritizing MC/DC-adequate test suites because they do not consider the complexity of the criterion. This paper presents new algorithms for test-suite reduction and prioritization that can be tailored effectively for use with MC/DC. The paper also presents the results of empirical studies of these algorithms. 相似文献

16.

Assessing and generating test sets in terms of behavioural adequacy

Gordon Fraser Neil Walkinshaw 《Software Testing, Verification and Reliability》2015,25(8):749-780

Identifying a finite test set that adequately captures the essential behaviour of a program such that all faults are identified is a well‐established problem. This is traditionally addressed with syntactic adequacy metrics (e.g. branch coverage), but these can be impractical and may be misleading even if they are satisfied. One intuitive notion of adequacy, which has been discussed in theoretical terms over the past three decades, is the idea of behavioural coverage: If it is possible to infer an accurate model of a system from its test executions, then the test set can be deemed to be adequate. Despite its intuitive basis, it has remained almost entirely in the theoretical domain because inferred models have been expected to be exact (generally an infeasible task) and have not allowed for any pragmatic interim measures of adequacy to guide test set generation. This paper presents a practical approach to incorporate behavioural coverage. Our BESTEST approach (1) enables the use of machine learning algorithms to augment standard syntactic testing approaches and (2) shows how search‐based testing techniques can be applied to generate test sets with respect to this criterion. An empirical study on a selection of Java units demonstrates that test sets with higher behavioural coverage significantly outperform current baseline test criteria in terms of detected faults. © 2015 The Authors. Software Testing, Verification and Reliability published by John Wiley & Sons, Ltd. 相似文献

17.

Generating semantically valid test inputs using constrained input grammars

《Information and Software Technology》2015

ContextGenerating test cases based on software input interface is a black-box testing technique that can be made more effective by using structured input models such as input grammars. Automatically generating grammar-based test inputs may lead to structurally valid but semantically invalid inputs that may be rejected in early semantic error checking phases of a system under test.ObjectiveThis paper aims to introduce a method for specifying a grammar-based input model with the model’s semantic constraints to be used in the generation of positive test inputs. It is also important that the method can generate effective test suites based on appropriate grammar-based coverage criteria.MethodFormal specification of both input structure and input semantics provides the opportunity to use model instantiation techniques to create model instances that satisfy all specified constraints. The input interface of a subject system can be specified using a high-level specification scheme such as attribute grammars, and a transformation function from this scheme to an instantiable formal modeling language can generate the desired model instances.ResultsWe propose a declarative grammar-based input specification method that is based on a variation of attribute grammars and allows the user to specify input constraints in addition to input structure. The model can be instantiated automatically to generate structurally and semantically valid test inputs. The proposed method has the capability to specify test requirements and coverage criteria and use them to generate valid test suites that satisfy test coverage criteria requirements.ConclusionThe work presented in this paper provides a black-box test generation method for grammar-based software inputs that can automatically generate criteria-covering test suites. 相似文献

18.

Testing Planning Domains (without Model Checkers)

Franco Raimondi Charles Pecheur Guillaume Brat 《Electronic Notes in Theoretical Computer Science》2007,190(2):113

We address the problem of verifying planning domains as used in model-based planning, for example in space missions. We propose a methodology for testing flight rules of planning domains which is self-contained, in the sense that flight rules are verified using a planner and no external tools are required. We review and analyse coverage conditions for requirements-based testing, and we reason in detail on "Unique First Cause" (UFC) coverage for test suites. We characterise flight rules using patterns, encoded using LTL, and we provide UFC coverage for them. We then present a translation of LTL formulae into planning goals, and illustrate our approach on a case study. 相似文献

19.

Evolutionary test data generation: a comparison of fitness functions

Alison Watkins Ellen M. Hufnagel 《Software》2006,36(1):95-116

Previous research using genetic algorithms to automate the generation of data for path testing has utilized several different fitness functions, assessing their usefulness by comparing them to random generation. This paper describes two sets of experiments that assess the performance of several fitness functions, relative to one another and to random generation. The results demonstrate that some fitness functions provide better results than others, generating fewer test cases to exercise a given program path. In these studies, the branch predicate and inverse path probability approaches were the best performers, suggesting that a two‐step process combining these two methods may be the most efficient and effective approach to path testing. Copyright © 2005 John Wiley & Sons, Ltd. 相似文献

20.

Ct coverage — initial results

W. G. Bently E. F. Miller 《Software Quality Journal》1993,2(1):29-47

Software testing techniques based upon automated analysis of control flow are useful for improving software reliability. The fundamental types of control flow analysis, in ascending order of effectiveness, are statement, branch, and path coverage. Automated tools that perform branch coverage analysis are now accepted practice and researchers are exploring the area that exists between branch and path analysis. Path testing is complicated by the huge number of paths in ordinary programs and by anomalies, such as infeasible paths. The Ct testing strategy is a method for obtaining a manageable set of path classes by specifying a minimum iteration count k. The Ct test coverage metric is a measurement of the proportion of Ct path classes that are exercised within a program. This paper describes an initial experiment in which a special, automated tool was used to evaluate the Ct coverage of tests for a C program of significant size. The results of the study showed that high values of branch coverage may not necessarily imply high path coverage, that approximately 15% of the functions analysed were too complex to analyse effectively, and that the majority of the remaining functions achieved on the order of 11% Ct k=1 coverage. Experimental work in the attainment of high Ct coverage suggests the development of a new, more efficient software testing strategy, which combines Ct path and data flow analysis. 相似文献