Similar literature
 Found 20 similar documents (search took 15 ms)
1.
The term grammar-based software describes software whose input can be specified by a context-free grammar. This grammar may occur explicitly in the software, in the form of an input specification to a parser generator, or implicitly, in the form of a hand-written parser. Grammar-based software includes not only programming language compilers, but also tools for program analysis, reverse engineering, software metrics and documentation generation. Hence, ensuring their completeness and correctness is a vital prerequisite for their use. In this paper we propose a strategy for the construction of test suites for grammar-based software, and illustrate this strategy using the ISO C++ grammar. We use the concept of grammar-rule coverage as a pivot for the reduction of an implementation-based test suite, and demonstrate a significant decrease in the size of this suite. The effectiveness of this reduced test suite is compared to the original test suite with respect to code coverage and, more importantly, fault detection. This work greatly expands upon previous work in this area and utilises large-scale mutation testing to compare the effectiveness of grammar-rule coverage to that of statement coverage as a reduction criterion for test suites of grammar-based software. This work finds that when grammar-rule coverage is used as the sole criterion for reducing test suites of grammar-based software, the fault detection capability of the reduced test suite is greatly diminished when compared to other coverage criteria such as statement coverage.
James F. Power
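
The reduction step described in the abstract above can be illustrated with a minimal sketch. It assumes a greedy set-cover heuristic, which is a common way to reduce a suite against a coverage criterion but is not confirmed by the abstract as the authors' algorithm. The function name `reduce_suite`, the test file names and the grammar-rule names are all hypothetical.

```python
def reduce_suite(coverage):
    """Greedy set-cover reduction (an assumed heuristic, not the paper's).

    coverage maps a test name to the set of requirements it covers
    (grammar rules, statements, ...). Returns a list of selected tests
    that together cover everything the full suite covers.
    """
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        # Pick the test covering the most still-uncovered requirements.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected


# Hypothetical data: which grammar rules each test program exercises.
rule_coverage = {
    "t1.cpp": {"class-specifier", "function-definition"},
    "t2.cpp": {"function-definition"},
    "t3.cpp": {"template-declaration", "class-specifier"},
    "t4.cpp": {"try-block"},
}
reduced = reduce_suite(rule_coverage)
```

Here `t2.cpp` is dropped because the rule it exercises is already covered by `t1.cpp`. The same function works for statement coverage if the sets hold statement identifiers instead of rule names, which is the kind of comparison the abstract reports.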

2.
Like all engineering disciplines, software engineering relies on quantitative analysis to support rationalized decision making. Software engineering researchers and practitioners have traditionally relied on software metrics to quantify attributes of software products and processes. Whereas traditional software metrics are typically based on a syntactic analysis of software products, we introduce and discuss metrics that are based on a semantic analysis: our metrics do not reflect the form or structure of software products, but rather the properties of their function. At a time when software systems grow increasingly large and complex, the focus on diagnosing, identifying and removing every fault in the software product ought to relinquish the stage to a more measured, more balanced, and more realistic approach, which emphasizes failure avoidance, in addition to fault avoidance and fault removal. Semantic metrics are a good fit for this purpose, reflecting as they do a system’s ability to avoid failure rather than its proneness to being free of faults.

3.
A state-based approach to integration testing based on UML models   (cited 3 times in total: 0 self-citations, 3 citations by others)
Correct functioning of object-oriented software depends upon the successful integration of classes. While individual classes may function correctly, several new faults can arise when these classes are integrated together. In this paper, we present a technique to enhance testing of interactions among modal classes. The technique combines UML collaboration diagrams and statecharts to automatically generate an intermediate test model, called SCOTEM (State COllaboration TEst Model). The SCOTEM is then used to generate valid test paths. We also define various coverage criteria to generate test paths from the SCOTEM model. In order to assess our technique, we have developed a tool and applied it to a case study to investigate its fault detection capability. The results show that the proposed technique effectively detects all the seeded integration faults when complying with the most demanding adequacy criterion and still achieves reasonably good results for less expensive adequacy criteria.

4.
Test-suite reduction techniques attempt to reduce the costs of saving and reusing test cases during software maintenance by eliminating redundant test cases from test suites. A potential drawback of these techniques is that reducing the size of a test suite might reduce its ability to reveal faults in the software. Previous studies have suggested that test-suite reduction techniques can reduce test-suite size without significantly reducing the fault-detection capabilities of test suites. These studies, however, involved particular programs and types of test suites, and to begin to generalize their results, further work is needed. This paper reports on the design and execution of additional studies, examining the costs and benefits of test-suite reduction, and the factors that influence these costs and benefits. In contrast to previous studies, results of these studies reveal that the fault-detection capabilities of test suites can be severely compromised by test-suite reduction. Copyright © 2002 John Wiley & Sons, Ltd.

5.
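To address the high number of false positives in fault reports produced by static analysis of software source code, this paper proposes a fault detection method based on software runtime characteristics, which detects faults by introducing dynamic analysis. First, the dynamic-testing instrumentation library is extended with probe functions designed for eight common fault patterns. Next, the program is searched for fault-monitoring locations, where the fault-monitoring probes are inserted. Finally, during software execution, the runtime characteristics carried in the instrumentation messages are analyzed to identify faults. Experimental results show that the method detects program faults effectively and that every fault it reports is real, which compensates for the high false-positive rate of static analysis.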

6.
A light-weight software-implemented fault injection (SWIFI) testing approach is introduced, focusing on technical process faults and system faults. The reaction of automated production systems (aPSs) and their programmable logic controller (PLC) software to these faults is tested. In order to tailor the testing approach to the aPS domain in industrial practice, our test generation is based on a classification of possible deviations, i.e. a classification of possible technical process and system faults as the PLC perceives them. As a result, both specification and test execution become more efficient for practitioners. Furthermore, the test specification is tailored for execution on IEC 61131-3 programming environments. This enables test cases to be executed against either a simulation or the real aPS.

7.
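Software-intensive equipment often contains many embedded real-time systems responsible for monitoring and control, and these are frequently safety-critical or mission-critical systems. To handle software fault detection, diagnosis and repair in such systems effectively, this paper proposes a multi-agent-based framework for monitoring runtime faults in real-time systems. The framework uses cooperation among multiple agents to build a runtime fault monitoring system that verifies, while the system is running, whether it satisfies property specifications expressed in temporal logic, and applies concrete algorithms to localize and repair faults.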

8.
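In the traditional G-O software reliability growth model, the fault detection rate and the initial total number of faults are the two key factors affecting software reliability. To make software reliability assessment more credible, and considering that correcting software may introduce new errors, the model's total number of latent faults and its fault detection rate are treated as functions of time. An improved G-O model is proposed along with an analytical solution method; the original and improved G-O models are compared, and the approach is validated on a worked example...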
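
For orientation, the classical Goel-Okumoto (G-O) model that this entry starts from can be written in a few lines. This is a sketch of the standard textbook form only: `a` is the expected total number of faults and `b` the per-fault detection rate, both constant. The improved model replaces them with functions of time whose form the truncated abstract does not give, so it is not reproduced here. The function names are illustrative.

```python
import math


def go_mean_faults(t, a, b):
    """Expected cumulative faults detected by time t: m(t) = a * (1 - exp(-b*t))."""
    return a * (1.0 - math.exp(-b * t))


def go_intensity(t, a, b):
    """Failure intensity, the derivative of m(t): a * b * exp(-b*t)."""
    return a * b * math.exp(-b * t)


def go_reliability(x, t, a, b):
    """Probability of no failure in (t, t + x]: exp(-(m(t + x) - m(t)))."""
    return math.exp(-(go_mean_faults(t + x, a, b) - go_mean_faults(t, a, b)))


# Hypothetical parameters: 100 expected faults, detection rate 0.1 per unit time.
detected_after_10 = go_mean_faults(10, a=100, b=0.1)
```

Because `a` is fixed, `m(t)` can never exceed the initial fault count, which is exactly the assumption that breaks down when fixes introduce new faults.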

9.
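Software error seeding can be used not only to evaluate software performance and study the characteristics of software errors, but also to provide the conditions needed for evaluating software testing methods. Since white-box testing targets code-level errors, program code errors are divided, for ease of seeding, into computation errors, domain errors and program interface errors, and an improved error seeding method based on program mutation is presented for these three classes.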

10.
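This work aims to evaluate the robustness of embedded systems and to test that robustness using software fault injection. It introduces the concepts of embedded-system robustness testing and the principles of software fault injection. Taking the testing of Linux kernel functions as an example, it analyzes the fault injection interfaces for system API parameters and proposes a GDB-based software fault injection method for robustness fault-injection testing. A fault injection test case for Linux API interfaces is carried out and its results are reported. The method offers a more intuitive and effective approach to robustness testing of embedded systems.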

11.
An experiment was conducted to evaluate an inter-procedural test adequacy criterion named Interface Mutation. Program SPACE, developed for the European Space Agency (ESA), was used in this experiment. The development record available for this program was used to find the faults uncovered during its development. Using this information the test process was reproduced starting with a version of SPACE containing several faults and then applying Interface Mutation. Thus we could evaluate the fault revealing effectiveness of Interface Mutation. Results from the experiment suggest that (a) the application of Interface Mutation favors the selection of fault revealing test cases when they exist and (b) Interface Mutation tends to select fault revealing test cases more efficiently than in the case where random selection is used.

12.
Regression testing is an important activity in the software life cycle, but it can also be very expensive. To reduce the cost of regression testing, software testers may prioritize their test cases so that those which are more important, by some measure, are run earlier in the regression testing process. One potential goal of test case prioritization techniques is to increase a test suite's rate of fault detection (how quickly, in a run of its test cases, that test suite can detect faults). Previous work has shown that prioritization can improve a test suite's rate of fault detection, but the assessment of prioritization techniques has been limited primarily to hand-seeded faults, largely due to the belief that such faults are more realistic than automatically generated (mutation) faults. A recent empirical study, however, suggests that mutation faults can be representative of real faults and that the use of hand-seeded faults can be problematic for the validity of empirical results focusing on fault detection. We have therefore designed and performed two controlled experiments assessing the ability of test case prioritization techniques to improve the rate of fault detection, measured relative to mutation faults. Our results show that prioritization can be effective relative to the faults considered, and they expose ways in which that effectiveness can vary with characteristics of faults and test suites. More importantly, a comparison of our results with those collected using hand-seeded faults reveals several implications for researchers performing empirical studies of test case prioritization techniques in particular and testing techniques in general.
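
The "rate of fault detection" mentioned above is commonly quantified in the prioritization literature with the APFD metric (average percentage of faults detected). The abstract does not name its metric, so treating APFD as the measure here is an assumption. For an ordering of n tests and m faults, where TF_i is the position of the first test that reveals fault i, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n). The sketch below uses hypothetical test and fault names.

```python
def apfd(order, detects, num_faults):
    """Average percentage of faults detected for one test ordering.

    order:      list of test names in execution order.
    detects:    dict mapping a test name to the set of fault ids it reveals.
    num_faults: total number of faults; every fault must be revealed by
                at least one test in the ordering.
    """
    n = len(order)
    first_pos = {}
    for pos, test in enumerate(order, start=1):
        for fault in detects.get(test, ()):
            # Keep only the earliest position at which each fault is revealed.
            first_pos.setdefault(fault, pos)
    if len(first_pos) != num_faults:
        raise ValueError("some faults are not detected by any test in the order")
    return 1.0 - sum(first_pos.values()) / (n * num_faults) + 1.0 / (2 * n)


# Hypothetical fault matrix: test C reveals two of the three faults.
detects = {"A": {1}, "B": set(), "C": {2, 3}}
unprioritized = apfd(["A", "B", "C"], detects, num_faults=3)
prioritized = apfd(["C", "A", "B"], detects, num_faults=3)
```

Running the fault-dense test first raises the score, which is the effect a prioritization technique tries to achieve without knowing the fault matrix in advance.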

13.
Although numerous empirical studies have been conducted to measure the fault detection capability of software analysis methods, few studies have been conducted using programs of similar size and characteristics. Therefore, it is difficult to derive meaningful conclusions on the relative detection ability and cost-effectiveness of various fault detection methods. In order to compare fault detection capability objectively, experiments must be conducted using the same set of programs to evaluate all methods and must involve participants who possess comparable levels of technical expertise. One such experiment was ‘Conflict1’, which compared voting, a testing method, self-checks, code reading by stepwise refinement and data-flow analysis methods on eight versions of a battle simulation program. Since an inspection method was not included in the comparison, the authors conducted a follow-up experiment ‘Conflict2’, in which five of the eight versions from Conflict1 were subjected to Fagan inspection. Conflict2 examined not only the number and types of faults detected by each method, but also the cost-effectiveness of each method, by comparing the average amount of effort expended in detecting faults. The primary findings of the Conflict2 experiment are the following. First, voting detected the largest number of faults, followed by the testing method, Fagan inspection, self-checks, code reading and data-flow analysis. Second, the voting, testing and inspection methods were largely complementary to each other in the types of faults detected. Third, inspection was far more cost-effective than the testing method studied. Copyright © 2002 John Wiley & Sons, Ltd.

14.
Two experimental comparisons of data flow and mutation testing are presented. These techniques are widely considered to be effective for unit-level software testing, but can only be analytically compared to a limited extent. We compare the techniques by evaluating the effectiveness of test data developed for each. We develop ten independent sets of test data for a number of programs: five to satisfy the mutation criterion and five to satisfy the all-uses data-flow criterion. These test sets are developed using automated tools, in a manner consistent with the way a test engineer might be expected to generate test data in practice. We use these test sets in two separate experiments. First we measure the effectiveness of the test data that was developed for one technique in terms of the other. Second, we investigate the ability of the test sets to find faults. We place a number of faults into each of our subject programs, and measure the number of faults that are detected by the test sets. Our results indicate that while both techniques are effective, mutation-adequate test sets are closer to satisfying the data flow criterion, and detect more faults.
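
To make the mutation criterion concrete, here is a toy sketch of how a test set is scored against mutants. It is not the tooling used in the study; the subject function, the hand-written mutants and the test inputs are all hypothetical. A mutant is "killed" when some test input makes it behave differently from the original.

```python
def original(a, b):
    """Subject under test: the larger of two numbers."""
    return a if a > b else b


# Hand-written mutants, each a small syntactic change to `original`.
mutants = {
    "relational: > to <": lambda a, b: a if a < b else b,
    "relational: > to >=": lambda a, b: a if a >= b else b,  # equivalent mutant
    "return swap": lambda a, b: b if a > b else a,
}


def mutation_score(tests, mutants, reference):
    """Return (fraction of mutants killed, names of killed mutants)."""
    killed = [
        name
        for name, mutant in mutants.items()
        # A mutant is killed if any test input exposes a different output.
        if any(mutant(*args) != reference(*args) for args in tests)
    ]
    return len(killed) / len(mutants), killed


tests = [(1, 2), (2, 1), (3, 3)]
score, killed = mutation_score(tests, mutants, original)
```

The `>=` mutant returns the same value as the original for every input, so no test can kill it. Such equivalent mutants are one reason a mutation score of 1.0 may be unreachable in practice.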

15.
Identifying a finite test set that adequately captures the essential behaviour of a program such that all faults are identified is a well-established problem. This is traditionally addressed with syntactic adequacy metrics (e.g. branch coverage), but these can be impractical and may be misleading even if they are satisfied. One intuitive notion of adequacy, which has been discussed in theoretical terms over the past three decades, is the idea of behavioural coverage: If it is possible to infer an accurate model of a system from its test executions, then the test set can be deemed to be adequate. Despite its intuitive basis, it has remained almost entirely in the theoretical domain because inferred models have been expected to be exact (generally an infeasible task) and have not allowed for any pragmatic interim measures of adequacy to guide test set generation. This paper presents a practical approach to incorporate behavioural coverage. Our BESTEST approach (1) enables the use of machine learning algorithms to augment standard syntactic testing approaches and (2) shows how search-based testing techniques can be applied to generate test sets with respect to this criterion. An empirical study on a selection of Java units demonstrates that test sets with higher behavioural coverage significantly outperform current baseline test criteria in terms of detected faults. © 2015 The Authors. Software Testing, Verification and Reliability published by John Wiley & Sons, Ltd.

16.
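To improve the testing quality of aerospace embedded software and ensure the success of aerospace missions, this work studies code logic analysis, one of the key activities in code review of aerospace embedded software. Drawing on an analysis of defect mechanisms, the defect search process, the defect exposure process and the consequences of defects, together with many years of software testing practice, it proposes ten main code logic analysis methods, including scenario analysis, timing analysis and hypothesized-fault source tracing. It analyzes how these methods are applied and compares code review with other testing techniques, and on that basis describes the engineering applicability of code review. The results have been adopted across third-party evaluation of aerospace mission software; practical data show good results, raising the defect detection rate of code review from the industry-accepted 30%-70% to over 90%. The analysis methods and reasoning are also a useful reference for dynamic test design and for developing automated defect detection tools.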

17.
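Pan Zhenghua (潘正华), Computer Science (计算机科学), 2006, 33(10): 131-133
Medium Logic (ML) is a non-classical logic proposed in recent years. The theory of ML's system characteristics has already established that ML is semantically complete. This paper studies the syntactic completeness of ML and proves the following results: (1) the medium propositional logic system MP in ML and its extension MP* are syntactically complete, whereas the medium predicate logic system MF, its extension MF*, and the medium predicate logic system with equality ME are not syntactically complete; (2) in general, if a consistent formal logical system is not syntactically complete, then none of its consistent extensions is syntactically complete.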

18.
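Spectrum-based fault localization has become an active research topic in software debugging because of its high localization efficiency. Such techniques typically use test coverage information to compute, for each program statement, a suspiciousness score indicating how likely it is to be faulty. Their efficiency, however, declines steadily as the number of faults in the program grows. To address this, a multiple-fault localization technique based on conditioned execution slicing spectra (conditioned execution slicing spectrum-based multiple fault localization, CESS-MFL) is proposed to improve the efficiency of localizing multiple faults. CESS-MFL first builds a spectrum matrix of fault-related conditioned execution slices from predicate conditions on the input variables, then computes the suspiciousness of each element (statement or statement block) in those slices in turn and generates a suspiciousness report. Experiments confirm that CESS-MFL localizes multiple faults more efficiently than the popular spectrum-based Tarantula technique and the slicing-based Intersection and Union techniques, and that it runs within practical time and space complexity.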
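
As a point of reference, the Tarantula baseline named in this entry scores a statement s as (failed(s)/total_failed) / (passed(s)/total_passed + failed(s)/total_failed). The sketch below implements that baseline formula only; it is not CESS-MFL, whose slicing step the abstract does not detail. The coverage data and statement numbers are hypothetical.

```python
def tarantula(spectrum, outcomes):
    """Tarantula suspiciousness per statement.

    spectrum: dict mapping a test name to the set of statements it executed.
    outcomes: dict mapping a test name to True (passed) or False (failed).
    Returns a dict mapping each executed statement to a score in [0, 1].
    """
    total_passed = sum(1 for ok in outcomes.values() if ok)
    total_failed = len(outcomes) - total_passed
    scores = {}
    for stmt in set().union(*spectrum.values()):
        passed = sum(1 for t, cov in spectrum.items() if stmt in cov and outcomes[t])
        failed = sum(1 for t, cov in spectrum.items() if stmt in cov and not outcomes[t])
        # Ratios default to 0 when there are no passing (or failing) tests.
        p = passed / total_passed if total_passed else 0.0
        f = failed / total_failed if total_failed else 0.0
        scores[stmt] = f / (p + f) if (p + f) else 0.0
    return scores


# Hypothetical program spectrum: statement 4 runs only in the failing test.
spectrum = {"t1": {1, 2, 3}, "t2": {1, 3}, "t3": {1, 2, 4}}
outcomes = {"t1": True, "t2": True, "t3": False}
scores = tarantula(spectrum, outcomes)
```

With a single fault, the statement executed only by failing tests ranks first. When several faults are present, failing tests caused by different faults blur these ratios, which is the efficiency loss the entry sets out to address.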

19.
RELAY is a model of faults and failures that defines failure conditions, which describe test data for which execution will guarantee that a fault originates erroneous behavior that also transfers through computations and information flow until a failure is revealed. This model of fault detection provides a framework within which other testing criteria's capabilities can be evaluated. Three test data selection criteria that detect faults in six fault classes are analyzed. This analysis shows that none of these criteria is capable of guaranteeing detection for these fault classes and points out two major weaknesses of these criteria. The first weakness is that the criteria do not consider the potential unsatisfiability of their rules. Each criterion includes rules that are sufficient to cause potential failures for some fault classes, yet when such rules are unsatisfiable, many faults may remain undetected. Their second weakness is failure to integrate their proposed rules.

20.
Code-coverage-based test data adequacy criteria typically treat all coverable code elements (such as statements, basic blocks or outcomes of decisions) as equal. In practice, however, the probability that a test case can expose a fault in a code element varies: some faults are more easily revealed than others. Thus, several researchers have suggested that if one could estimate the probability that a fault in a code element will cause a failure, one could use this estimate to determine the number of executions of a code element that are required to achieve a certain level of confidence in that element's correctness. This estimate, in turn, could be used to improve the fault-detection effectiveness of test suites and help testers distribute testing resources more effectively. This conjecture is intriguing; however, like many such conjectures it has never been directly examined empirically. If empirical evidence were to support this conjecture, it would motivate further research into methodologies for obtaining fault-exposure-potential estimates and incorporating them into test data adequacy criteria. This paper reports the results of experiments conducted to investigate the effects of incorporating an estimate of fault-exposure probability into the statement coverage test data adequacy criterion. The results of these experiments, however, ran contrary to the conjectures of previous researchers. Although incorporation of the estimates did produce statistically significant increases in the fault-detection effectiveness of test suites, these increases were quite small, suggesting that the approach might not be able to produce the gains hoped for and might not be worth the cost of its employment. Copyright © 2002 John Wiley & Sons, Ltd.
