Similar Articles
1.
Context: Scientific software plays an important role in critical decision making, for example making weather predictions based on climate models, and computation of evidence for research publications. Recently, scientists have had to retract publications due to errors caused by software faults. Systematic testing can identify such faults in code. Objective: This study aims to identify specific challenges, proposed solutions, and unsolved problems faced when testing scientific software. Method: We conducted a systematic literature survey to identify and analyze relevant literature. We identified 62 studies that provided relevant information about testing scientific software. Results: We found that challenges faced when testing scientific software fall into two main categories: (1) testing challenges that occur due to characteristics of scientific software such as oracle problems and (2) testing challenges that occur due to cultural differences between scientists and the software engineering community, such as viewing the code and the model that it implements as inseparable entities. In addition, we identified methods to potentially overcome these challenges and their limitations. Finally, we describe unsolved challenges and how software engineering researchers and practitioners can help to overcome them. Conclusions: Scientific software presents special challenges for testing. Specifically, cultural differences between scientist developers and software engineers, along with the characteristics of the scientific software, make testing more difficult. Existing techniques such as code clone detection can help to improve the testing process. Software engineers should consider special challenges posed by scientific software, such as oracle problems, when developing testing techniques.

2.
Abstract

In this paper, we analyze a method for detecting software piracy. A metamorphic generator is used to create morphed copies of a base piece of software. A hidden Markov model is trained on the opcode sequences extracted from these morphed copies, and the resulting trained model is used to score suspect software to determine its similarity to the base software. A high score indicates that the suspect software may be a modified version of the base software, suggesting that further investigation is warranted. In contrast, a low score indicates that the suspect software differs significantly from the base software. We show that our approach is robust, in the sense that the base software must be extensively modified before it evades detection.
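A minimal sketch of the training/scoring pipeline described above, using hmmlearn as a stand-in library (the paper does not prescribe one); `extract_opcodes()` and the thresholding step are illustrative assumptions, not part of the original work.

```python
# Sketch of HMM-based similarity scoring over opcode sequences.
# extract_opcodes() is a hypothetical helper returning each program's opcodes
# encoded as integers over a fixed, shared vocabulary.
import numpy as np
from hmmlearn import hmm   # CategoricalHMM is named MultinomialHMM in older hmmlearn releases

def train_base_model(morphed_opcode_seqs, n_states=3):
    """Fit an HMM on opcode sequences extracted from morphed copies of the base software."""
    X = np.concatenate(morphed_opcode_seqs).reshape(-1, 1)
    lengths = [len(seq) for seq in morphed_opcode_seqs]
    model = hmm.CategoricalHMM(n_components=n_states, n_iter=100, random_state=0)
    model.fit(X, lengths)
    return model

def similarity_score(model, suspect_opcodes):
    """Length-normalized log-likelihood: higher means closer to the base software."""
    X = np.asarray(suspect_opcodes).reshape(-1, 1)
    return model.score(X) / len(suspect_opcodes)

# Usage: flag the suspect binary for further investigation if its score exceeds
# a threshold calibrated on scores of known-unrelated programs.
```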

3.
Context: Complexity measures provide us some information about software artifacts. A measure of the difficulty of testing a piece of code could be very useful to take control of the test phase. Objective: The aim of this paper is the definition of a new measure of the difficulty for a computer to generate test cases, which we call Branch Coverage Expectation (BCE). We also analyze the most common complexity measures and the most important features of a program. With this analysis we try to discover whether there exists a relationship between them and the code coverage of an automatically generated test suite. Method: The definition of this measure is based on a Markov model of the program. This model is used not only to compute the BCE, but also to provide an estimation of the number of test cases needed to reach a given coverage level in the program. In order to check our proposal, we perform a theoretical validation and we carry out an empirical validation study using 2600 test programs. Results: The results show that the previously existing measures are not so useful for estimating the difficulty of testing a program, because they are not highly correlated with the code coverage. Our proposed measure is much more correlated with the code coverage than the existing complexity measures. Conclusion: The high correlation of our measure with the code coverage suggests that the BCE is a very promising way of measuring the difficulty of automatically testing a program. Our proposed measure is useful for predicting the behavior of an automatic test case generator.
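The exact BCE definition is given in the paper; the illustrative sketch below only shows how per-branch coverage probabilities (e.g., derived from a Markov model of the program) translate into an expected coverage level and an estimate of the number of test cases needed, the second use of the model mentioned above. It is not the authors' formulation.

```python
# Illustrative only: not the authors' exact BCE formulation.
# Given an estimate p_i that one automatically generated test case covers branch i,
# the expected coverage after n tests and the number of tests needed for a target
# coverage level follow directly.
def expected_coverage(branch_probs, n_tests):
    """Expected fraction of branches covered by n independent random test cases."""
    return sum(1.0 - (1.0 - p) ** n_tests for p in branch_probs) / len(branch_probs)

def tests_needed(branch_probs, target=0.9, max_tests=100_000):
    """Smallest n whose expected coverage reaches the target level."""
    for n in range(1, max_tests + 1):
        if expected_coverage(branch_probs, n) >= target:
            return n
    return None

# Three easy branches and one hard-to-reach branch: the hard branch dominates
# the number of test cases the generator is expected to need.
print(tests_needed([0.9, 0.8, 0.5, 0.01], target=0.9))
```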

4.
5.
6.
Context: Some recent static techniques for automatic bug localization have been built around modern information retrieval (IR) models such as latent semantic indexing (LSI). Latent Dirichlet allocation (LDA) is a generative statistical model that has significant advantages, in modularity and extensibility, over both LSI and probabilistic LSI (pLSI). Moreover, LDA has been shown effective in topic-model-based information retrieval. In this paper, we present a static LDA-based technique for automatic bug localization and evaluate its effectiveness. Objective: We evaluate the accuracy and scalability of the LDA-based technique and investigate whether it is suitable for use with open-source software systems of varying size, including those developed using agile methods. Method: We present five case studies designed to determine the accuracy and scalability of the LDA-based technique, as well as its relationships to software system size and to source code stability. The studies examine over 300 bugs across more than 25 iterations of three software systems. Results: The results of the studies show that the LDA-based technique maintains sufficient accuracy across all bugs in a single iteration of a software system and is scalable to a large number of bugs across multiple revisions of two software systems. The results of the studies also indicate that the accuracy of the LDA-based technique is not affected by the size of the subject software system or by the stability of its source code base. Conclusion: We conclude that an effective static technique for automatic bug localization can be built around LDA. We also conclude that there is no significant relationship between the accuracy of the LDA-based technique and the size of the subject software system or the stability of its source code base. Thus, the LDA-based technique is widely applicable.
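A hedged sketch of the core ranking step with scikit-learn (not the authors' tooling); `source_files` is assumed to map file paths to preprocessed text (identifiers, comments), and the bug report is a plain-text query.

```python
# Sketch of LDA-based bug localization: rank source files by the similarity of
# their topic distributions to the bug report's topic distribution.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

def rank_files(source_files, bug_report, n_topics=50):
    paths, docs = zip(*source_files.items())
    vectorizer = CountVectorizer(stop_words="english")
    counts = vectorizer.fit_transform(docs)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    file_topics = lda.fit_transform(counts)                       # per-file topic mixtures
    query_topics = lda.transform(vectorizer.transform([bug_report]))
    scores = cosine_similarity(query_topics, file_topics)[0]      # similarity to each file
    return sorted(zip(paths, scores), key=lambda pair: -pair[1])  # most suspicious first
```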

7.
Context: Recently, network measures have been proposed to predict fault-prone modules. Leveraging the dependency relationships between software entities, network measures describe the structural features of software systems. However, there is no consensus about their effectiveness for fault-proneness prediction. Specifically, the predictive ability of network measures in an effort-aware context has not been addressed. Objective: We aim to provide a comprehensive evaluation of the predictive effectiveness of network measures, taking into consideration the effort needed to inspect the code. Method: We first constructed software source code networks of 11 open-source projects by extracting the data and call dependencies between modules. We then employed univariate logistic regression to investigate how each single network measure was correlated with fault-proneness. Finally, we built multivariate prediction models to examine the usefulness of network measures under three prediction settings: cross-validation, across-release, and inter-project predictions. In particular, we used effort-aware performance indicators to compare their predictive ability against the commonly used code metrics in both ranking and classification scenarios. Results: Based on the 11 open-source software systems, our results show that: (1) most network measures are significantly positively related to fault-proneness; (2) the performance of network measures varies under different prediction settings; (3) network measures have inconsistent effects on various projects. Conclusion: Network measures are of practical value in the context of effort-aware fault-proneness prediction, but researchers and practitioners should be careful in choosing whether and when to use network measures in practice.
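As a rough illustration of one univariate step, the sketch below computes a single network measure on a module dependency graph and relates it to fault-proneness with logistic regression; networkx and scikit-learn are stand-ins, not the study's toolchain, the example modules and labels are invented, and the effort-aware evaluation is omitted.

```python
# One univariate analysis step: betweenness centrality of modules in the
# dependency graph versus their fault labels.
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression

def univariate_network_measure(dependencies, faulty_modules):
    """dependencies: iterable of (caller, callee) pairs; faulty_modules: set of module names."""
    graph = nx.DiGraph(dependencies)
    betweenness = nx.betweenness_centrality(graph)   # one example network measure
    modules = sorted(graph.nodes())
    X = np.array([[betweenness[m]] for m in modules])
    y = np.array([1 if m in faulty_modules else 0 for m in modules])
    model = LogisticRegression().fit(X, y)
    return model.coef_[0][0]   # positive coefficient: higher centrality, more fault-prone

deps = [("ui", "core"), ("core", "db"), ("core", "net"), ("net", "db")]
print(univariate_network_measure(deps, faulty_modules={"core"}))
```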

8.
Context: Building defect prediction models in large organizations involves many challenges due to limited resources and tight schedules in the software development lifecycle. It is not easy to collect data, utilize any type of algorithm and build a permanent model at once. We conducted a study in a large telecommunications company in Turkey to deploy a software measurement program and to predict pre-release defects. Building on our prior publication, we share our experience in terms of the project steps (i.e. challenges and opportunities), and we introduce new techniques that improve our earlier results. Objective: In our previous work, we built similar predictors using data representative of US software development. Our task here was to check whether those predictors were specific solely to US organizations or applicable to a broader class of software. Method: We present our approach and results in the form of an experience report. Specifically, we make use of different techniques for improving the information content of the software data and the performance of a Naïve Bayes classifier in the prediction model, locally tuned for the company. We increase the information content of the software data by using module dependency data and improve the performance by adjusting the hyper-parameter (decision threshold) of the Naïve Bayes classifier. We report and discuss our results in terms of defect detection rates and false alarms, and carry out a cost–benefit analysis to show that our approach can be efficiently put into practice. Results: Our general result is that defect predictors generalize across a wide range of software (in both US and Turkish organizations). Our specific results indicate that, for the organization studied, the use of version history information along with code metrics decreased false alarms by 22%, the use of dependencies between modules further reduced false alarms by 8%, and optimizing the decision threshold of the Naïve Bayes classifier using code metrics and version history information further reduced false alarms by 30% in comparison to a prediction using only code metrics and a default decision threshold. Conclusion: Applying statistical techniques and machine learning in a real-life scenario is a difficult yet feasible task. Using simple statistical and algorithmic techniques produces an average detection rate of 88%. Although using dependency data improves our results, such data are difficult to collect and analyze in general. Therefore, we recommend optimizing the hyper-parameter of the proposed technique, Naïve Bayes, to calibrate the defect prediction model rather than employing more complex classifiers. We also recommend that researchers who explore statistical and algorithmic methods for defect prediction spend less time on their algorithms and more time on studying the pragmatic considerations of large organizations.
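A simplified sketch of the decision-threshold adjustment, using scikit-learn's Gaussian Naïve Bayes as a stand-in for the study's classifier; the threshold grid and the false-alarm budget are illustrative assumptions, not values from the study.

```python
# Sweep the decision threshold of a Naive Bayes defect predictor to trade
# detection rate (pd) against false alarms (pf).
import numpy as np
from sklearn.naive_bayes import GaussianNB

def tune_threshold(X_train, y_train, X_val, y_val, max_pf=0.25):
    """Pick the threshold with the best detection rate whose false-alarm rate stays under max_pf."""
    y_val = np.asarray(y_val)
    clf = GaussianNB().fit(X_train, y_train)
    probs = clf.predict_proba(X_val)[:, 1]        # P(defective | module metrics), labels 0/1
    best = None
    for threshold in np.linspace(0.05, 0.95, 19):
        pred = probs >= threshold
        pd = pred[y_val == 1].mean() if (y_val == 1).any() else 0.0   # detection rate
        pf = pred[y_val == 0].mean() if (y_val == 0).any() else 0.0   # false alarm rate
        if pf <= max_pf and (best is None or pd > best[1]):
            best = (threshold, pd, pf)
    return clf, best   # best = (threshold, detection rate, false alarms)
```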

9.
Source code management systems (such as git) record changes to the code repositories of Open-Source Software (OSS) projects. The metadata about a change includes a change message recording the intention of the change. Classification of changes into different change types, based on change messages, has been explored in the past to understand the evolution of software systems, but only from the perspective of change size and change density. Software evolution analysis based on change classification with a focus on change evolution patterns is still an open research problem. This study examines the change messages of 106 OSS projects, as recorded in their git repositories, to explore their evolutionary patterns with respect to the types of changes performed over time. An automated keyword-based classifier technique is applied to the change messages to categorize the changes into various types (corrective, adaptive, perfective, preventive, and enhancement). Cluster analysis helps to uncover the distinct change patterns that each change type follows. We identify three categories of the 106 projects for each change type: high activity, moderate activity, and low activity. Evolutionary behavior differs for projects in different categories. The projects with high and moderate activity receive the most changes during months 76–81 of the project lifetime. Project attributes such as the number of committers, the number of files changed, and the total number of commits seem to contribute the most to the change activity of the projects. The statistical findings show that the change activity of a project is related to the number of contributors, the amount of work done, and the total commits of the project, irrespective of the change type. Further, we explored the languages and domains of the projects to correlate change types with them. The statistical analysis indicates that there is no significant or strong relation between change types and the domains and languages of the 106 projects.
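A minimal sketch of such a keyword-based classifier; the keyword lists below are illustrative assumptions, not the exact lexicon used in the study.

```python
# Minimal keyword-based commit-message classifier (word-prefix matching).
import re

KEYWORDS = {
    "corrective":  ["fix", "bug", "defect", "error", "fault", "crash"],
    "adaptive":    ["port", "platform", "upgrade", "compatib", "migrat"],
    "perfective":  ["refactor", "clean", "performanc", "optimiz", "restructur"],
    "preventive":  ["test", "assert", "lint", "warning", "validat"],
    "enhancement": ["add", "new", "feature", "implement", "support"],
}

def classify_change(message):
    """Return the first change type whose keywords match the message."""
    msg = message.lower()
    for change_type, words in KEYWORDS.items():
        if any(re.search(r"\b" + w, msg) for w in words):
            return change_type
    return "unclassified"

print(classify_change("Fix null pointer crash in parser"))  # -> corrective
print(classify_change("Add support for OAuth login"))       # -> enhancement
```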

10.
Context: Reusing software by means of copy and paste is a frequent activity in software development. The duplicated code is known as a software clone and the activity is known as code cloning. Software clones may lead to bug propagation and serious maintenance problems. Objective: This study reports an extensive systematic literature review of software clones in general and software clone detection in particular. Method: We used the standard systematic literature review method based on a comprehensive set of 213 articles, from a total of 2039 articles published in 11 leading journals and 37 premier conferences and workshops. Results: The existing literature about software clones is classified broadly into different categories. The importance of semantic clone detection and model-based clone detection led to further classifications. An empirical evaluation of clone detection tools/techniques is presented. Clone management, its benefits, and its cross-cutting nature are reported. The number of studies pertaining to nine different types of clones is reported. Thirteen intermediate representations and 24 match detection techniques are reported. Conclusion: We call for an increased awareness of the potential benefits of software clone management, and identify the need to develop semantic and model clone detection techniques. Recommendations are given for future research.

11.
Software piracy is the unauthorised reconstruction or illegal redistribution of licensed software. Distinguishing pirated copies from the base software is a major concern, since pirated software can lead to significant financial losses as well as serious security vulnerabilities. To detect software piracy, we recently proposed Metamorphic Analysis for Automatic Software Piracy Detection (MetaSPD) with a proof-of-concept evaluation. The core idea of MetaSPD is inspired by metamorphic malware detection, owing to its similarity to software piracy detection. MetaSPD works primarily by mining the opcode graph of the base software to extract micro-signatures. It then leverages a classifier model to decide whether a given suspicious file is a pirated version of the base software. This paper extends our prior work in several respects. First, we present a retrospective appraisal of the main literature, aiming to lay bare the status quo of software piracy detection, and discuss the current problems of the field to motivate our work. We then elaborate on MetaSPD itself and its constituent components. We provide two extensive experiments to demonstrate the effectiveness of MetaSPD. The experiments were carried out on two different datasets, each comprising 1300 morphed variants of the respective base software that act as pirated versions of that software. We conducted our experiments using three different classifiers. The paper also includes a detailed discussion of the different properties and concerns of MetaSPD. The results corroborate that an attacker who uses a pirated version of the given software can hardly hide the illegal usage, even by applying extensive obfuscations to the code.
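A simplified sketch in the spirit of the opcode-graph idea: compare programs by their opcode transition-frequency graphs. MetaSPD's micro-signature mining and classifier stages are omitted, and the opcode lists below are illustrative.

```python
# Build normalized opcode transition graphs and compare them with cosine similarity.
from collections import Counter

def opcode_graph(opcodes):
    """Normalized edge weights of the opcode transition graph."""
    edges = Counter(zip(opcodes, opcodes[1:]))
    total = sum(edges.values()) or 1
    return {edge: count / total for edge, count in edges.items()}

def cosine(a, b):
    dot = sum(a.get(k, 0.0) * b[k] for k in b)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

base    = opcode_graph(["push", "mov", "call", "mov", "add", "ret"])
suspect = opcode_graph(["push", "mov", "call", "mov", "sub", "add", "ret"])
# A suspect whose transition profile stays close to the base software's is a
# candidate pirated copy, even after obfuscation.
print(round(cosine(base, suspect), 3))
```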

12.
Context: Enterprise software systems (e.g., enterprise resource planning software) are often deployed in different contexts (e.g., different organizations or different business units or branches of one organization). However, even though organizations, business units or branches have the same or similar business goals, they may differ in how they achieve these goals. Thus, many enterprise software systems are subject to variability and adapted depending on the context in which they are used. Objective: Our goal is to provide a snapshot of variability in large-scale enterprise software systems. We aim at understanding the types of variability that occur in large industrial enterprise software systems. Furthermore, we aim at identifying how variability is handled in such systems. Method: We performed an exploratory case study in two large software organizations, involving two large enterprise software systems. Data were collected through interviews and document analysis. Data were analyzed following a grounded theory approach. Results: We identified seven types of variability (e.g., functionality, infrastructure) and eight mechanisms to handle variability (e.g., add-ons, code switches). Conclusions: We provide generic types for classifying variability in enterprise software systems, and reusable mechanisms for handling such variability. Some variability types and handling mechanisms for enterprise software systems found in the real world extend existing concepts and theories. Others confirm findings from previous research literature on variability in software in general and are therefore not specific to enterprise software systems. Our findings also offer a theoretical foundation for describing variability handling in practice. Future work needs to provide more evaluations of the theoretical foundations, and refine variability handling mechanisms into more detailed practices.

13.
14.
To improve the testing quality of aerospace embedded software and ensure the successful completion of aerospace missions, this work studies code logic analysis, one of the key activities in code review of aerospace embedded software. Based on an analysis of the mechanisms of software defects, the defect-finding process, the defect-exposure process, and the consequences that defects trigger, combined with many years of software testing engineering practice, ten principal code logic analysis methods are proposed, including scenario analysis, timing analysis, and hypothetical-fault tracing. An application analysis of the code logic analysis methods is carried out, together with a comparison between code review and other testing techniques, and an engineering applicability statement for code review is given. The research results have been broadly applied in third-party evaluation of aerospace mission software; practical data show good results, raising the defect detection rate of code review from the industry-recognized 30%–70% to above 90%. The analysis methods and ideas also provide a useful reference for dynamic test design and for developing automated software defect detection tools.

15.
Context: In software, there are the error cases that are anticipated at specification and design time, those encountered at development and testing time, and those that were never anticipated before happening in production. Is it possible to learn from the anticipated errors during design in order to analyze and improve the resilience against the unanticipated ones in production? Objective: In this paper, we aim at analyzing and improving how software handles unanticipated exceptions. The first objective is to set up contracts about exception handling and a way to assess them automatically. The second is to improve the resilience capabilities of software by transforming the source code. Method: We devise an algorithm, called short-circuit testing, which injects exceptions during test suite execution so as to simulate unanticipated errors. It is a kind of fault-injection technique dedicated to exception handling. This algorithm collects data that is used for verifying two formal contracts capturing two resilience properties with respect to exceptions: the source-independence and pure-resilience contracts. We then propose a code modification technique, called "catch-stretching", which allows error-recovery code (in the form of catch blocks) to be more resilient. Results: Our evaluation is performed on 9 open-source software applications and consists in analyzing 241 catch blocks executed during test suite execution. Our results show that 101/214 of them (47%) expose resilience properties as defined by our exception contracts and that 84/214 of them (39%) can be transformed to be more resilient. Conclusion: Our work shows that it is possible to reason about software resilience by injecting exceptions during test suite execution. The collected information allows us to apply a source code transformation that improves the resilience against unanticipated exceptions. This works best if the test suite exercises the exceptional programming language constructs in many different scenarios.
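A hedged sketch of what injecting an exception at a dependency boundary during test execution can look like, here with Python's unittest.mock rather than the authors' tooling; the application code, module names and fallback behaviour are all illustrative.

```python
# Self-contained sketch of exception injection during a test, in the spirit of
# short-circuit testing (the real technique instruments catch sites in the
# application under test; everything here is illustrative).
from unittest.mock import patch
import smtplib  # stand-in dependency whose failure we inject

def notify_user(address, message):
    """Application code with an error-recovery block (the 'catch block' under study)."""
    try:
        with smtplib.SMTP("localhost") as smtp:
            smtp.sendmail("noreply@example.com", [address], message)
        return "sent"
    except Exception:
        return "queued_for_retry"   # resilient fallback instead of crashing

def test_notify_user_survives_injected_exception():
    # Inject an unanticipated error at the dependency boundary while the test runs.
    with patch("smtplib.SMTP", side_effect=RuntimeError("injected failure")):
        assert notify_user("a@example.com", "hi") == "queued_for_retry"

test_notify_user_survives_injected_exception()
print("catch block handled the injected exception")
```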

16.
Context: Software documentation is an integral part of any software development process. However, software practitioners are often concerned about the value, degree of usage, and usefulness of documentation during development and maintenance. Objective: Motivated by the needs of NovAtel Inc. (NovAtel), a world-leading company developing software systems in support of global navigation satellite systems, and based on the results of a former systematic mapping study, we aimed at a better understanding of the usage and the usefulness of various technical documents during software development and maintenance. Method: We utilized the results of a former systematic mapping study and performed an industrial case study at NovAtel. Starting from the jointly defined analysis goals, the research method incorporates qualitative and quantitative analysis of 55 documents (design, test and process related) and 1630 of their revisions. In addition, we conducted a survey on the usage and usefulness of documents. A total of 25 staff members from the industrial partner, all having a medium to high level of experience, participated in the survey. Results: In the context of the case study, a number of findings were derived. They include that (1) technical documentation was consulted least frequently for maintenance purposes and most frequently as an information source for development, (2) source code was considered most frequently as the preferred information source during software maintenance, (3) there is no significant difference between the usage of various documentation types during development and maintenance, and (4) in line with the initial hypotheses, up-to-date information, accuracy and preciseness have the highest impact on the usefulness of technical documentation. Conclusions: It is concluded that the usage of documentation differs for various purposes and depends on the type of information needs as well as the tasks to be completed (e.g., development and maintenance). The results have been confirmed to be helpful for the company under study, and the firm is currently implementing some of the recommendations given.

17.
Context: Testing and verification of automotive embedded software is a major challenge. Software production in the automotive domain comprises three stages: developing automotive functions as Simulink models, generating code from the models, and deploying the resulting code on hardware devices. Automotive software artifacts are subject to three rounds of testing corresponding to the three production stages: Model-in-the-Loop (MiL), Software-in-the-Loop (SiL) and Hardware-in-the-Loop (HiL) testing. Objective: We study testing of continuous controllers at the Model-in-the-Loop (MiL) level, where both the controller and the environment are represented by models and connected in a closed-loop system. These controllers make up a large part of automotive functions, and monitor and control the operating conditions of physical devices. Method: We identify a set of requirements characterizing the behavior of continuous controllers, and develop a search-based technique based on random search, adaptive random search, hill climbing and simulated annealing algorithms to automatically identify worst-case test scenarios, which are utilized to generate test cases for these requirements. Results: We evaluated our approach by applying it to an industrial automotive controller (with 443 Simulink blocks) and to a publicly available controller (with 21 Simulink blocks). Our experience shows that automatically generated test cases lead to MiL-level simulations indicating potential violations of the system requirements. Further, not only does our approach generate significantly better test cases faster than random test case generation, but it also achieves better results than test scenarios devised by domain experts. Finally, our generated test cases uncover discrepancies between environment models and the real world when they are applied at the Hardware-in-the-Loop (HiL) level. Conclusion: We propose an automated approach to MiL testing of continuous controllers using search. The approach is implemented in a tool and has been successfully applied to a real case study from the automotive domain.
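A generic simulated-annealing loop of the kind such search-based techniques build on; `simulate()` stands in for running the closed-loop MiL model and returning a fitness value (e.g., worst-case deviation from the desired output), and all parameters and the toy objective are illustrative.

```python
# Simulated-annealing search for a worst-case test scenario.
import math
import random

def anneal(simulate, initial, neighbor, steps=1000, start_temp=1.0, cooling=0.995):
    current, current_fit = initial, simulate(initial)
    best, best_fit = current, current_fit
    temp = start_temp
    for _ in range(steps):
        candidate = neighbor(current)
        fit = simulate(candidate)
        # Always accept improvements; accept worse candidates with a probability
        # that shrinks as the temperature cools, to escape local optima.
        if fit > current_fit or random.random() < math.exp((fit - current_fit) / temp):
            current, current_fit = candidate, fit
        if fit > best_fit:
            best, best_fit = candidate, fit
        temp *= cooling
    return best, best_fit

# Toy usage: search for the scenario parameter maximizing a made-up objective.
worst_case, fitness = anneal(
    simulate=lambda x: -(x - 3.7) ** 2,                  # stand-in for the MiL run
    initial=0.0,
    neighbor=lambda x: x + random.uniform(-0.5, 0.5),
)
print(worst_case, fitness)
```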

18.
Context: In the era of globally-distributed software engineering, the practice of global software testing (GST) has witnessed increasing adoption. Although there have been ethnographic studies of the development aspects of global software engineering, there have been fewer studies of GST, which, to succeed, can require dealing with unique challenges. Objective: To address this limitation of existing studies, we conducted, and in this paper report the findings of, a study of a vendor organization involved in one kind of GST practice: outsourced, offshored software testing. Method: We conducted an ethnographically-informed study of three vendor-side testing teams over a period of 2 months. We used methods such as interviews and participant observations to collect the data and the thematic-analysis approach to analyze the data. Findings: Our findings describe how the participant test engineers perceive software testing and deadline pressures, the challenges that they encounter, and the strategies that they use for coping with the challenges. The findings reveal several interesting insights. First, motivation and appreciation play an important role for our participants in ensuring that high-quality testing is performed. Second, intermediate onshore teams increase the degree of pressure experienced by the participant test engineers. Third, vendor team participants perceive productivity differently from their client teams, which results in unproductive-productivity experiences. Lastly, participants encounter quality-dilemma situations for various reasons. Conclusion: The study findings suggest the need for (1) appreciating test engineers’ efforts, (2) investigating the team structure’s influence on pressure and the GST practice, (3) understanding culture’s influence on other aspects of GST, and (4) identifying and addressing quality-dilemma situations.

19.
In this paper we address the important problem of instruction fetch for future wide-issue superscalar processors. Our approach focuses on understanding the interaction between software and hardware techniques that target an increase in instruction fetch bandwidth. That is the objective, for instance, of the Hardware Trace Cache (HTC). We design a profile-based code reordering technique which targets maximizing the sequentiality of instructions while still trying to minimize instruction cache misses. We call our software approach the Software Trace Cache (STC). We evaluate our software approach and then compare it with the HTC and with the combination of both techniques. Our results on PostgreSQL show that for large codes with few loops and deterministic execution sequences the STC offers better results than an HTC. Moreover, the software and hardware approaches combine well to obtain improved results.
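A simplified sketch of the greedy chain-building intuition behind profile-guided code reordering; this is not the STC layout algorithm itself, and the edge profile below is an illustrative assumption.

```python
# Greedy chain building over a basic-block edge profile: hot successor edges are
# committed first so that frequent fall-through paths end up laid out sequentially.
def build_chains(edge_profile):
    """edge_profile: {(src_block, dst_block): taken_count} from a profiling run."""
    successor, predecessor = {}, {}

    def would_form_cycle(src, dst):
        node = dst
        while node in successor:
            node = successor[node]
            if node == src:
                return True
        return False

    for (src, dst), _count in sorted(edge_profile.items(), key=lambda e: -e[1]):
        if src in successor or dst in predecessor or src == dst:
            continue                      # each block keeps at most one layout neighbor
        if would_form_cycle(src, dst):
            continue
        successor[src], predecessor[dst] = dst, src

    chains = []
    blocks = set(successor) | set(predecessor)
    for block in blocks - set(predecessor):     # chain heads have no chosen predecessor
        chain = [block]
        while chain[-1] in successor:
            chain.append(successor[chain[-1]])
        chains.append(chain)
    return chains   # lay out each chain contiguously to maximize sequential fetch

print(build_chains({("A", "B"): 90, ("B", "C"): 80, ("A", "C"): 10, ("C", "A"): 5}))
```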

20.
曹英魁  孙泽宇  邹艳珍  谢冰 《软件学报》2021,32(4):1006-1022
During development, developers frequently need to modify multiple similar pieces of code when fixing defects or updating versions. How to perform such code modifications automatically has become a hot research topic in software engineering. One effective approach is, given a set of code change examples, to extract the code change patterns they contain and use them to assist the automatic transformation of similar code. Among existing work, deep-learning-based methods have made some progress, but when capturing long-range information dependencies between code elements, their effectiveness is not…
