Similar Documents
20 similar documents found.
1.
The fault-detection effectiveness of a test suite refers to the extent to which the suite can detect the defects present in the software, and how to evaluate this effectiveness is an important problem. Coverage and mutation score are the two most important and most widely used measures of test suite effectiveness. To quantify the fault-detection ability of test suites, researchers have carried out extensive work on test suite effectiveness evaluation and made considerable progress; at the same time, existing studies have reached inconsistent conclusions, and several pressing challenges remain in this field. This paper systematically reviews and summarizes the research on test suite effectiveness evaluation by scholars at home and abroad over the years. It first describes the problems studied in test suite effectiveness evaluation, then introduces and analyzes coverage-based and mutation-score-based evaluation of test suite effectiveness as well as the application of effectiveness evaluation to test suite optimization, and finally points out the challenges facing this research area and suggests future research directions.
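As a minimal illustration of the two measures named in this abstract, the sketch below computes statement coverage and mutation score from hypothetical test results; the data structures, helper names, and numbers are assumptions for illustration, not part of the surveyed work.

```python
def statement_coverage(executed_statements, all_statements):
    """Fraction of program statements executed by the test suite."""
    return len(executed_statements & all_statements) / len(all_statements)

def mutation_score(killed_mutants, all_mutants, equivalent_mutants=frozenset()):
    """Fraction of non-equivalent (killable) mutants killed by the test suite."""
    killable = all_mutants - equivalent_mutants
    return len(killed_mutants & killable) / len(killable)

# Hypothetical example: 10 statements (8 covered); 20 mutants, 2 equivalent, 12 killed.
stmts = set(range(10))
cov = statement_coverage(set(range(8)), stmts)
mutants = set(range(20))
score = mutation_score(set(range(12)), mutants, equivalent_mutants={18, 19})
print(f"coverage={cov:.2f}, mutation score={score:.2f}")  # coverage=0.80, mutation score=0.67
```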

2.
We introduce a new performance metric, called load balancing factor (LBF), to assist programmers when evaluating different tuning alternatives. The LBF metric differs from traditional performance metrics since it is intended to measure the performance implications of a specific tuning alternative rather than quantifying where time is spent in the current version of the program. A second unique aspect of the metric is that it provides guidance about moving work within a distributed or parallel program rather than reducing it. A variation of the LBF metric can also be used to predict the performance impact of changing the underlying network. The LBF metric is computed incrementally and online during the execution of the program to be tuned. We also present a case study that shows that our metric can accurately predict the actual performance gains for a test suite of six programs.

3.
Mutation testing is a testing technique that has been applied successfully to several programming languages. Despite its benefits for software testing, the high computational cost of mutation testing has kept it from being widely used. Several refinements have been proposed to reduce its cost by reducing the number of generated mutants; one of those is evolutionary mutation testing (EMT). Evolutionary mutation testing aims at generating a reduced set of mutants with an evolutionary algorithm, which searches for potentially equivalent and difficult-to-kill mutants that help improve the test suite. Evolutionary mutation testing has been evaluated in two contexts so far, i.e., web service compositions and object-oriented C++ programmes. This study explores its performance when applied to event processing language queries of various domains. This study also considers the impact of the test data, since a lack of events or the need to have specific values in them can hinder testing. The effectiveness of evolutionary mutation testing with the original test data generators and the new Internet of Things test event generator tool is compared in multiple case studies.

4.
In corrective maintenance, modified software is regression tested using selected test cases in order to ensure that the modifications have not caused adverse effects. This activity of selective regression testing involves regression test selection, which refers to selecting test cases from the previously run test suite, and test-coverage identification. In this paper, we propose three test-selection methods and two coverage identification metrics. The three methods aim to reduce the number of selected test cases for retesting the modified software. The first method, referred to as modification-based reduction version 1 (MBR1), selects a reduced number of test cases based on the modification made and its effects in the software. The second method, referred to as modification-based reduction version 2 (MBR2), improves MBR1 by further omitting tests that do not cover the modification. The third method, referred to as precise reduction (PR), reduces the number of test cases selected by omitting non-modification-revealing tests from the initial test suite. The two coverage metrics are McCabe-based regression test metrics, referred to as the Reachability regression Test selection McCabe-based metric (RTM) and the data-flow Slices regression Test McCabe-based metric (STM). These metrics aim to assist the regression tester in monitoring test-coverage adequacy, reveal any shortage or redundancy in the test suite, and assist in identifying where additional tests may be required for regression testing. We empirically compare MBR1, MBR2, and PR with three reduction and precision-oriented methods on 60 test-problems. The results show that PR selects the least number of test cases most of the time and omits non-modification-revealing tests. We also demonstrate the applicability of our proposed methods to object-oriented regression testing at the class level. Further, we illustrate typical application of the RTM and STM metrics using the 60 test-problems and two coverage-oriented selective regression-testing methods.

5.
Software metrics are computed for the purpose of evaluating certain characteristics of the software developed. A Fortran static source code analyzer, FORTRANAL, was developed to study 31 metrics, including a new hybrid metric introduced in this paper, and applied to a database of 255 programs, all of which were student assignments. Comparisons among these metrics are performed. Their cross-correlation confirms the internal consistency of some of these metrics which belong to the same class. To remedy the incompleteness of most of these metrics, the proposed metric incorporates context sensitivity to structural attributes extracted from a flow graph. It is also concluded that many volume metrics have similar performance, while some control metrics surprisingly correlate well with typical volume metrics in the test samples used. A flexible class of hybrid metrics can incorporate both volume and control attributes in assessing software complexity.

6.
Mutation testing has historically been used to assess the fault-finding effectiveness of a test suite or other verification technique. Mutation analysis, rather, entails augmenting a test suite to detect all killable mutants. Concerns about the time efficiency of mutation analysis may prohibit its widespread, practical use. The goal of our research is to assess the effectiveness of the mutation analysis process when used by software testers to augment a test suite to obtain higher statement coverage scores. We conducted two empirical studies and have shown that mutation analysis can be used by software testers to effectively produce new test cases and to improve statement coverage scores in a feasible amount of time. Additionally, we find that our user study participants view mutation analysis as an effective but relatively expensive technique for writing new test cases. Finally, we have shown that the choice of mutation tool and operator set can play an important role in determining how efficient mutation analysis is for producing new test cases.

7.
Automated program repair is increasingly gaining traction, due to its potential to reduce debugging cost greatly. The feasibility of automated program repair has been shown in a number of works, and the research focus is gradually shifting toward the quality of generated patches. One promising direction is to control the quality of generated patches by controlling the quality of test suites used for automated program repair. In this paper, we ask the following research question: “Can traditional test-suite metrics proposed for the purpose of software testing also be used for the purpose of automated program repair?” We empirically investigate whether traditional test-suite metrics such as statement/branch coverage and mutation score are effective in controlling the reliability of generated repairs (the likelihood that repairs cause regression errors). We conduct the largest-scale experiments of this kind to date with real-world software, and for the first time perform a correlation study between various test-suite metrics and the reliability of generated repairs. Our results show that, in general, with the increase of traditional test-suite metrics, the reliability of repairs tends to increase. In particular, such a trend is most strongly observed in statement coverage. Our results imply that the traditional test-suite metrics proposed for software testing can also be used for automated program repair to improve the reliability of repairs.

8.
Conformance testing procedures for generating tests from the finite state model representation of Role-Based Access Control (RBAC) policies are proposed and evaluated. A test suite generated using one of these procedures has excellent fault detection ability but is astronomically large. Two approaches to reduce the size of the generated test suite were investigated. One is based on a set of six heuristics and the other directly generates a test suite from the finite state model using random selection of paths in the policy model. Empirical studies revealed that the second approach to test suite generation, combined with one or more heuristics, is most effective in detecting both first-order mutation faults and malicious faults, and generates a significantly smaller test suite than the one generated directly from the finite state models.

9.
余伟, 江艳, 张凡. 《控制理论与应用》, 2022, 39(12): 2293-2301
Most existing model-based fault-diagnosis methods measure residuals under the Euclidean distance, which makes it difficult to handle fault diagnosis for closed-loop systems effectively. Starting from the new perspective of the gap metric between systems in robust control theory, and exploiting its fundamental suitability for measuring closed-loop performance, this paper uses coprime factorization to build a mathematical model of the system that includes uncertainties and disturbances. Based on the gap metric, methods for fault detection and for fault grading and classification are given. Finally, on the dual closed-loop motor systems widely used in in-service aircraft, electric vehicles, and industrial servo drives, the effectiveness of the method is verified in numerical experiments through comparison with traditional methods.

10.
Eyerman S., Eeckhout L. IEEE Micro, 2008, 28(3): 42-53
Assessing the performance of multiprogram workloads running on multithreaded hardware is difficult because it involves a balance between single-program performance and overall system performance. This article argues for developing multiprogram performance metrics in a top-down fashion starting from system-level objectives. The authors propose two performance metrics: average normalized turnaround time, a user-oriented metric, and system throughput, a system-oriented metric.
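As a sketch of these two metrics as they are commonly defined (each program's multiprogram execution time normalized against its single-program execution time), the snippet below shows both computations; the variable names and timing values are illustrative assumptions, not data from the article.

```python
def average_normalized_turnaround_time(single_times, multi_times):
    """ANTT: mean per-program slowdown under multiprogram execution (lower is better)."""
    return sum(m / s for s, m in zip(single_times, multi_times)) / len(single_times)

def system_throughput(single_times, multi_times):
    """STP: accumulated per-program progress under multiprogram execution (higher is better)."""
    return sum(s / m for s, m in zip(single_times, multi_times))

# Hypothetical example: two co-scheduled programs.
single = [10.0, 20.0]   # single-program execution times
multi  = [15.0, 25.0]   # execution times when the programs run together
print(average_normalized_turnaround_time(single, multi))  # (1.5 + 1.25) / 2 = 1.375
print(system_throughput(single, multi))                   # 10/15 + 20/25 ≈ 1.467
```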

11.

When custom modeling tools are used for designing complex safety-critical systems (e.g., critical cyber-physical systems), the tools themselves need to be validated by systematic testing to prevent tool-specific bugs from reaching the system. Testing of such modeling tools relies upon an automatically generated set of models as a test suite. While many software testing practices recommend that this test suite should be diverse, model diversity has not been studied systematically for graph models. In this paper, we propose different diversity metrics for models by generalizing and exploiting neighborhood and predicate shapes as abstraction. We evaluate such shape-based diversity metrics using various distance functions in the context of mutation testing of graph constraints and access policies for two separate industrial DSLs. Furthermore, we evaluate the quality (i.e., bug detection capability) of different (random and consistent) model generation techniques for mutation testing purposes.


12.
Ontology languages such as OWL are being widely used as the Semantic Web movement gains momentum. With the proliferation of the Semantic Web, more and more large-scale ontologies are being developed in real-world applications to represent and integrate knowledge and data. There is an increasing need for measuring the complexity of these ontologies in order for people to better understand, maintain, reuse and integrate them. In this paper, inspired by the concept of software metrics, we propose a suite of ontology metrics, at both the ontology-level and class-level, to measure the design complexity of ontologies. The proposed metrics are analytically evaluated against Weyuker’s criteria. We have also performed empirical analysis on public domain ontologies to show the characteristics and usefulness of the metrics. We point out possible applications of the proposed metrics to ontology quality control. We believe that the proposed metric suite is useful for managing ontology development projects.

13.
There are many situations in quality control of manufacturing processes in which the quality of a process is characterized by the spatial distribution of certain particles in the product, and the more uniform the particle distribution is, the better the quality is. To realize quality control and guide process improvement efforts, the degree of spatial uniformity of particle distributions needs to be assessed. On the other hand, many quantitative metrics have been developed in areas outside manufacturing for measuring uniformity of point patterns, which can be applied for this purpose. However, critical issues exist in applying existing metrics for quality control relating to which metrics to choose and how to use them in specific situations. To provide general guidelines on these issues, this research identifies popular uniformity metrics scattered in different areas and compares their performance in detecting nonuniform particle distributions under various practical scenarios through a comprehensive numerical study. Effects of different factors on the performance of the metrics are revealed and the best metric is found. The use and effectiveness of the selected metric is also demonstrated in a case study where it is applied to data from emerging material fabrication processes in nanomanufacturing and biomanufacturing.
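The abstract does not say which metric the study selects; as one concrete example of the kind of point-pattern uniformity metric it surveys, the sketch below computes the Clark–Evans nearest-neighbour index (observed mean nearest-neighbour distance divided by its expected value under complete spatial randomness). Values above 1 indicate a more uniform (regular) distribution. The window, particle coordinates, and parameters are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def clark_evans_index(points, area):
    """Clark-Evans R: mean nearest-neighbour distance / expectation under randomness.
    R close to 1 is random, R > 1 is more uniform (regular), R < 1 is clustered."""
    tree = cKDTree(points)
    # k=2 because the nearest neighbour of each point is the point itself (distance 0).
    dists, _ = tree.query(points, k=2)
    mean_nn = dists[:, 1].mean()
    density = len(points) / area
    expected = 0.5 / np.sqrt(density)
    return mean_nn / expected

# Hypothetical particles in a unit square: a jittered grid (near-uniform) vs. pure noise.
rng = np.random.default_rng(0)
grid = np.stack(np.meshgrid(np.linspace(0.05, 0.95, 10),
                            np.linspace(0.05, 0.95, 10)), -1).reshape(-1, 2)
print(clark_evans_index(grid + rng.normal(0, 0.005, grid.shape), area=1.0))  # > 1
print(clark_evans_index(rng.random((100, 2)), area=1.0))                     # ≈ 1
```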

14.
With the rise of blockchain technology, smart contract security has attracted increasing attention from researchers and industry, and a number of studies on smart contract defect detection techniques already exist. Software defect prediction is an effective complement to defect detection: it can optimize the allocation of testing resources and improve testing efficiency. However, there is so far no research on software defect prediction for smart contracts. To address this problem, a defect prediction method for Solidity smart contracts is proposed. First, a metric set targeting the variables, functions, structures, and language features specific to Solidity smart contracts (the smart contract-Solidity, SC-Sol metric set) is designed and combined with a metric set focusing on object-oriented features (the code complexity and features of object-oriented program, COOP metric set) into the COOP-SC-Sol metric set. Then, the relevant metric information is extracted from Solidity smart contract code and combined with defect detection results to construct a Solidity smart contract defect dataset. On this basis, 7 regression models and 6 classification models are applied to defect prediction for Solidity smart contracts, in order to verify the performance differences of different metric sets and models in predicting defect counts and defect proneness. The experimental results show that, compared with the COOP metric set...
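The (truncated) abstract names 7 regression and 6 classification models without saying which ones; purely as an illustration of the defect-proneness classification step, the sketch below trains one generic classifier on a table of per-contract metric values. The feature columns, labels, and data are hypothetical stand-ins, not the SC-Sol or COOP metrics from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_contracts = 200
# Hypothetical per-contract metrics (e.g. lines of code, number of functions, modifiers used).
X = rng.random((n_contracts, 3)) * [500, 40, 10]
# Hypothetical defect-proneness labels (1 = at least one defect reported for the contract).
y = (X[:, 0] + 20 * rng.standard_normal(n_contracts) > 250).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
# F1 is a common choice for imbalanced defect data; AUC is another.
scores = cross_val_score(clf, X, y, cv=5, scoring="f1")
print("5-fold F1:", scores.mean().round(3))
```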

15.
Existing saliency detection evaluation metrics often produce inconsistent evaluation results. Because of the widespread application of image saliency detection, we propose a meta-metric to evaluate the performance of these metrics based on the preference of an application that uses saliency maps as weighting maps. This study uses content-based image retrieval (CBIR) as the representative application. First, we perform CBIR using image features extracted from deep convolutional layers of convolutional neural networks as well as saliency maps computed by various saliency detection algorithms as the weighting maps over queries. Second, we establish the preference order of the saliency detection algorithms in the CBIR application by sorting the mean average precision. Third, we determine the preference order of these algorithms using existing saliency detection evaluation metrics. Finally, our meta-metric evaluates these metrics by correlating the preference order in the CBIR application with that determined by each evaluation metric. Experiments on three publicly available datasets show that, of 24 evaluation metrics, the traditional metric of area under the receiver operating characteristic curve is the best metric for a CBIR application.
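As a sketch of the final correlation step, assuming each algorithm's application-side rank and metric-side rank are already available (the ranks below are made up, and Kendall's tau is only one possible choice of rank correlation, not necessarily the paper's), the meta-metric score of an evaluation metric can be taken as the agreement between the two orderings:

```python
from scipy.stats import kendalltau

# Hypothetical ranks of five saliency algorithms (1 = best).
rank_in_cbir   = [1, 2, 3, 4, 5]   # preference order from mean average precision in CBIR
rank_by_metric = [2, 1, 3, 5, 4]   # preference order produced by one evaluation metric

tau, p_value = kendalltau(rank_in_cbir, rank_by_metric)
print(f"meta-metric score (Kendall tau) = {tau:.2f}")  # closer to 1.0 = better agreement
```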

16.
The empirical assessment of test techniques plays an important role in software testing research. One common practice is to seed faults in subject software, either manually or by using a program that generates all possible mutants based on a set of mutation operators. The latter allows the systematic, repeatable seeding of large numbers of faults, thus facilitating the statistical analysis of fault detection effectiveness of test suites; however, we do not know whether empirical results obtained this way lead to valid, representative conclusions. Focusing on four common control and data flow criteria (block, decision, C-use, and P-use), this paper investigates this important issue based on a middle-sized industrial program with a comprehensive pool of test cases and known faults. Based on the data available thus far, the results are very consistent across the investigated criteria as they show that the use of mutation operators is yielding trustworthy results: generated mutants can be used to predict the detection effectiveness of real faults. Applying such a mutation analysis, we then investigate the relative cost and effectiveness of the above-mentioned criteria by revisiting fundamental questions regarding the relationships between fault detection, test suite size, and control/data flow coverage. Although such questions have been partially investigated in previous studies, we can use a large number of mutants, which helps decrease the impact of random variation in our analysis and allows us to use a different analysis approach. Our results are then compared with published studies; plausible reasons for the differences are provided, and the research leads us to suggest a way to tune the mutation analysis process to possible differences in fault detection probabilities in a specific environment.

17.
Prioritizing test cases with string distances
Test case prioritisation aims at finding an ordering which enhances a certain property of an ordered test suite. Traditional techniques rely on the availability of code or a specification of the program under test. We propose to use string distances on the text of test cases for their comparison and elaborate a prioritisation algorithm. Such a prioritisation does not require code or a specification and can be useful for initial testing and in cases when code is difficult to instrument. In this paper, we also report on experiments performed on the “Siemens Test Suite”, where the proposed prioritisation technique was compared with random permutations and four classical string distance metrics were evaluated. The obtained results, confirmed by a statistical analysis, indicate that prioritisation based on string distances is more efficient in finding defects than random ordering of the test suite: the test suites prioritized using string distances are more efficient in detecting the strongest mutants, and, on average, have a better APFD than randomly ordered test suites. The results suggest that string distances can be used for prioritisation purposes, and Manhattan distance could be the best choice.
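As a rough sketch of the idea (not the paper's implementation), the snippet below defines a Manhattan distance over the character codes of two test-case texts, padding the shorter one, and greedily orders a suite so that each next test is the one farthest from those already selected. The padding scheme, the farthest-first greedy strategy, and the toy test texts are assumptions made for illustration.

```python
def manhattan_distance(a: str, b: str) -> int:
    """Sum of absolute differences of character codes, padding the shorter string with NULs."""
    length = max(len(a), len(b))
    a, b = a.ljust(length, "\0"), b.ljust(length, "\0")
    return sum(abs(ord(x) - ord(y)) for x, y in zip(a, b))

def prioritize(test_texts):
    """Greedy farthest-first ordering: start with the longest test, then repeatedly pick the
    remaining test whose minimum distance to the already-selected tests is largest."""
    remaining = list(test_texts)
    ordered = [max(remaining, key=len)]
    remaining.remove(ordered[0])
    while remaining:
        nxt = max(remaining, key=lambda t: min(manhattan_distance(t, s) for s in ordered))
        ordered.append(nxt)
        remaining.remove(nxt)
    return ordered

# Toy suite: three textual test cases (e.g. command-line inputs).
print(prioritize(["add 1 2", "add 100 200", "sort -r file.txt"]))
```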

18.
This paper proposes a new set of metrics for agent-oriented software: amount of knowledge, learning ability, reaction time, total number of agents, dependence degree, depended-upon degree, number of interactions, total number of interactions, social rationality, and intelligence quotient. On this basis, a prototype measurement tool, the measurement agent MA, is developed.

19.
Scalability is an important performance metric of parallel computing, but traditional scalability metrics reflect the scalability of parallel computing only from a single aspect, which makes it difficult to measure its overall performance fully. This paper studies scalability metrics in depth. From the many performance parameters of parallel computing, a group of key ones is chosen and normalized. The area of the Kiviat graph formed by these parameters is then used to characterize the overall performance of parallel computing. On this basis, a novel iso-area-of-performance scalability metric for parallel computing is proposed, and the relationship between the new metric and the traditional ones is analyzed. Finally, the new metric is applied to analyze the scalability of Cannon's matrix multiplication algorithm under the LogP model. The proposed metric is significant for improving parallel computing architectures and for tuning parallel algorithm design.
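To make the Kiviat-graph construction concrete, the sketch below computes the area of the polygon obtained by plotting n normalized performance parameters on equally spaced radial axes. This only illustrates the geometric step; the parameter names and values are made up and are not the key parameters selected in the paper.

```python
import math

def kiviat_area(normalized_values):
    """Area of the polygon whose i-th vertex lies at radius r_i on the i-th of n
    equally spaced axes: 0.5 * sin(2*pi/n) * sum(r_i * r_{i+1}), with cyclic indices."""
    r = list(normalized_values)
    n = len(r)
    return 0.5 * math.sin(2 * math.pi / n) * sum(r[i] * r[(i + 1) % n] for i in range(n))

# Hypothetical normalized parameters (e.g. speedup, efficiency, utilization, comm. ratio, memory).
print(kiviat_area([0.8, 0.7, 0.9, 0.6, 0.75]))
```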

20.
Mathematical Morphology (MM) is a general method for image processing based on set theory. The two basic morphological operators are dilation and erosion. From these, several nonlinear filters have been developed, usually with polynomial complexity, because the two basic operators depend strongly on the definition of the structural element. Most efforts to improve the algorithms' speed for each operator are based on structural-element decomposition and/or efficient codification. A new framework and a theoretical basis toward the construction of fast morphological operators (of zero complexity) for an infinite (countable) family of regular metric spaces are presented in this work. The framework is completely defined by the three axioms of a metric. The theoretical basis developed here points out properties of some metric spaces and relationships between metric spaces in the same family, solely in terms of the properties of the four basic metrics stated in this work. Concepts such as bounds, neighborhoods, and contours are also related within the same framework. The presented results are general in the sense that they cover the most commonly used metrics, such as the chamfer, city-block, and chessboard metrics. Generalizations and new results related to distances and distance transforms, which in turn are used to develop the morphological operations in constant time, in contrast with the polynomial-time algorithms, are also given.
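As a hedged sketch of the distance-transform route to fast morphology (not this paper's framework), the snippet below erodes and dilates a binary image by a chessboard-metric ball of radius r by thresholding a single distance transform, so that changing r costs only one comparison per pixel; the toy image is an assumption.

```python
import numpy as np
from scipy.ndimage import distance_transform_cdt

def erode(binary, r, metric="chessboard"):
    """Erosion by a ball of radius r: keep pixels at distance greater than r from the background."""
    return distance_transform_cdt(binary, metric=metric) > r

def dilate(binary, r, metric="chessboard"):
    """Dilation by a ball of radius r: keep pixels within distance r of the foreground."""
    return distance_transform_cdt(~binary, metric=metric) <= r

# Toy binary image: a 7x7 square of foreground in a 15x15 image.
img = np.zeros((15, 15), dtype=bool)
img[4:11, 4:11] = True
print(erode(img, 1).sum(), dilate(img, 1).sum())  # 25 (5x5 square) and 81 (9x9 square)
```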
