首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Inconsistent identifiers make it difficult for developers to understand source code. In particular, large software systems written by several developers can be vulnerable to identifier inconsistency. Unfortunately, it is not easy to detect inconsistent identifiers that are already used in source code. Although several techniques have been proposed to address this issue, many of these techniques can result in false alarms since such techniques do not accept domain words and idiom identifiers that are widely used in programming practice. This paper proposes an approach to detecting inconsistent identifiers based on a custom code dictionary. It first automatically builds a Code Dictionary from the existing API documents of popular Java projects by using an Natural Language Processing (NLP) parser. This dictionary records domain words with dominant part-of-speech (POS) and idiom identifiers. This set of domain words and idioms can improve the accuracy when detecting inconsistencies by reducing false alarms. The approach then takes a target program and detects inconsistent identifiers of the program by leveraging the Code Dictionary. We provide CodeAmigo, a GUI-based tool support for our approach. We evaluated our approach on seven Java based open-/proprietary- source projects. The results of the evaluations show that the approach can detect inconsistent identifiers with 85.4 % precision and 83.59 % recall values. In addition, we conducted an interview with developers who used our approach, and the interview confirmed that inconsistent identifiers frequently and inevitably occur in most software projects. The interviewees then stated that our approach can help to better detect inconsistent identifiers that would have been missed through manual detection.  相似文献   

2.
To improve software quality, static or dynamic defect-detection tools accept programming rules as input and detect their violations in software as defects. As these programming rules are often not well documented in practice, previous work developed various approaches that mine programming rules as frequent patterns from program source code. Then these approaches use static or dynamic defect-detection techniques to detect pattern violations in source code under analysis. However, these existing approaches often produce many false positives due to various factors. To reduce false positives produced by these mining approaches, we develop a novel approach, called Alattin, that includes new mining algorithms and a technique for detecting neglected conditions based on our mining algorithm. Our new mining algorithms mine patterns in four pattern formats: conjunctive, disjunctive, exclusive-disjunctive, and combinations of these patterns. We show the benefits and limitations of these four pattern formats with respect to false positives and false negatives among detected violations by applying those patterns to the problem of detecting neglected conditions.  相似文献   

3.
代码克隆是指软件程序中一组相同或相近的代码片段,它广泛存在于软件中,因此如何发现代码克隆成为软件维护的一个重要问题。目前已有的克隆检测工具大多针对单一版本进行完整的克隆检测,然而对于大规模、复杂软件系统而言,在软件演化过程中随着代码的改变,不断重新检测代码克隆将花费较高的代价。针对这一问题,提出了一种基于分组的增量克隆检测方法。该方法根据前后两个版本的差异将源代码分为发生变化和未发生变化的两组,通过组内和组间的克隆分析实现增量的克隆检测。基于所提出的方法,在克隆检测工具CCFinderX的基础上实现了一个名为ICDBG(incremental clone detector based on grouping)的原型工具。实验证明,在变更较小时,该方法能够在保证正确性的同时显著减少克隆检测时间。  相似文献   

4.
汪昕  陈驰  赵逸凡  彭鑫  赵文耘 《软件学报》2019,30(5):1342-1358
开发人员经常需要使用各种应用程序编程接口(application programming interface,简称API)来复用已有的软件框架、类库等.由于API自身的复杂性、文档资料的缺失等原因,开发人员经常会误用API,从而导致代码缺陷.为了自动检测API误用缺陷,需要获得API使用规约,并根据规约对API使用代码进行检测.然而,可用于自动检测的API规约难以获得,而人工编写并维护的代价又很高.针对以上问题,将深度学习中的循环神经网络模型应用于API使用规约的学习及API误用缺陷的检测.在大量的开源Java代码基础上,通过静态分析构造API使用规约训练样本,同时利用这些训练样本搭建循环神经网络学习API使用规约.在此基础上,针对API使用代码进行基于上下文的语句预测,并通过预测结果与实际代码的比较发现潜在的API误用缺陷.对所提出的方法进行实现并针对Java加密相关的API及其使用代码进行了实验评估,结果表明,该方法能够在一定程度上实现API误用缺陷的自动发现.  相似文献   

5.
Many software libraries, especially those commercial ones, provide API documentation in natural languages to describe correct API usages. However, developers may still write code that is inconsistent with API documentation, partially because many developers are reluctant to carefully read API documentation as shown by existing research. As these inconsistencies may indicate defects, researchers have proposed various detection approaches, and these approaches need many known specifications. As it is tedious to write specifications manually for all APIs, various approaches have been proposed to mine specifications automatically. In the literature, most existing mining approaches rely on analyzing client code, so these mining approaches would fail to mine specifications when client code is not sufficient. Instead of analyzing client code, we propose an approach, called Doc2Spec, that infers resource specifications from API documentation in natural languages. We evaluated our approach on the Javadocs of five libraries. The results show that our approach performs well on real scale libraries, and infers various specifications with relatively high precisions, recalls, and F-scores. We further used inferred specifications to detect defects in open source projects. The results show that specifications inferred by Doc2Spec are useful to detect real defects in existing projects.  相似文献   

6.
In the lifetime of a software product, development costs are only the tip of the iceberg. Nearly 90% of the cost is maintenance due to error correction, adaptation and mainly enhancements. As Lehman and Belady [Lehman, M. M., & Belady, L. A. (1985). Program evolution: Processes of software change. Academic Press Professional.] state that software will become increasingly unstructured as it is changed. One way to overcome this problem is refactoring. Refactoring is an approach which reduces the software complexity by incrementally improving internal software quality. Our motivation in this research is to detect the classes that need to be rafactored by analyzing the code complexity. We propose a machine learning based model to predict classes to be refactored. We use Weighted Naïve Bayes with InfoGain heuristic as the learner and we conducted experiments with metric data that we collected from the largest GSM operator in Turkey. Our results showed that we can predict 82% of the classes that need refactoring with 13% of manual inspection effort on the average.  相似文献   

7.
Techniques for detecting defects in source code are fundamental to the success of any software development approach. A software development organization therefore needs to understand the utility of techniques such as reading or testing in its own environment. Controlled experiments have proven to be an effective means for evaluating software engineering techniques and gaining the necessary understanding about their utility. This paper presents a characterization scheme for controlled experiments that evaluate defect-detection techniques. The characterization scheme permits the comparison of results from similar experiments and establishes a context for cross-experiment analysis of those results. The characterization scheme is used to structure a detailed survey of four experiments that compared reading and testing techniques for detecting defects in source code. We encourage educators, researchers, and practitioners to use the characterization scheme in order to develop and conduct further instances of this class of experiments. By repeating this experiment we expect the software engineering community will gain quantitative insights about the utility of defect-detection techniques in different environments.This work was conducted while the author was with the Department of Computer Science, University of Kaiserslautern, 67653 Kaiserslautern, Germany.  相似文献   

8.
What Types of Defects Are Really Discovered in Code Reviews?   总被引:1,自引:0,他引:1  
Research on code reviews has often focused on defect counts instead of defect types, which offers an imperfect view of code review benefits. In this paper, we classified the defects of nine industrial (C/C++) and 23 student (Java) code reviews, detecting 388 and 371 defects, respectively. First, we discovered that 75 percent of defects found during the review do not affect the visible functionality of the software. Instead, these defects improved software evolvability by making it easier to understand and modify. Second, we created a defect classification consisting of functional and evolvability defects. The evolvability defect classification is based on the defect types found in this study, but, for the functional defects, we studied and compared existing functional defect classifications. The classification can be useful for assigning code review roles, creating checklists, assessing software evolvability, and building software engineering tools. We conclude that, in addition to functional defects, code reviews find many evolvability defects and, thus, offer additional benefits over execution-based quality assurance methods that cannot detect evolvability defects. We suggest that code reviews may be most valuable for software products with long life cycles as the value of discovering evolvability defects in them is greater than for short life cycle systems.  相似文献   

9.
Rodrigo Andrade  Paulo Borba 《Software》2020,50(10):1905-1929
In collaborative software development, developers submit their contributions to repositories that are used to integrate code from various collaborators. To avoid privacy and security issues, code contributions are often reviewed before integration. Although careful manual code review can detect such issues, it might be time-consuming, expensive, and error-prone. Automatic analysis tools can also detect privacy and security issues, but they often demand significant developer effort, or are domain specific, considering fixed framework specific vulnerability sources and sinks. To reduce these problems, in this paper we propose the Salvum policy language to support the specification of constraints that help to protect sensitive information from being inadvertently accessed by specific code contributions. We implement a tool that automatically checks Salvum policies for systems of different technical domains. We also investigate whether Salvum can find policy violations for a number of open-source projects. We find evidence that Salvum helps to detect violations even for well-supported and highly active projects. Moreover, our tool helps to find 80 violations in benchmark projects.  相似文献   

10.
Much current software defect prediction work focuses on the number of defects remaining in a software system. In this paper, we present association rule mining based methods to predict defect associations and defect correction effort. This is to help developers detect software defects and assist project managers in allocating testing resources more effectively. We applied the proposed methods to the SEL defect data consisting of more than 200 projects over more than 15 years. The results show that, for defect association prediction, the accuracy is very high and the false-negative rate is very low. Likewise, for the defect correction effort prediction, the accuracy for both defect isolation effort prediction and defect correction effort prediction are also high. We compared the defect correction effort prediction method with other types of methods - PART, C4.5, and Naive Bayes - and show that accuracy has been improved by at least 23 percent. We also evaluated the impact of support and confidence levels on prediction accuracy, false-negative rate, false-positive rate, and the number of rules. We found that higher support and confidence levels may not result in higher prediction accuracy, and a sufficient number of rules is a precondition for high prediction accuracy.  相似文献   

11.
An essential goal for programmers is to minimize the cost of identifying and correcting defects in source code. Code review is commonly used for identifying programming defects. However, manual code review has some shortcomings: (1) it is time-consuming and (2) outcomes are subjective and depend on the skills of reviewers. An automated approach for assisting in code reviews is thus highly desirable. We present a tool for assisting in code review and results from our experiments evaluating the tool in different scenarios. The tool leveraged content available from professional programmer support forums (eg, StackOverflow.com) to determine potential defectiveness of a given piece of source code. The defectiveness is expressed on the scale of {Likely defective, neutral, unlikely to be defective}. The basic idea employed in the tool is (1) to identify a set P of discussion posts on Stack Overflow such that each pP contains source code fragment(s), which sufficiently resemble the input code C being reviewed, and (2) to determine the likelihood of C being defective by considering all pP. A novel aspect of our approach is to use document fingerprinting for comparing two pieces of source code. Our choice of document fingerprinting technique is inspired by source code plagiarism detection tools where it has proven to be very successful. In the experiments that we performed to verify the effectiveness of our approach, source code samples from more than 300 GitHub open-source repositories were taken as input. An F1 score of 0.94 has been achieved in identifying correct/relevant results.  相似文献   

12.
应用程序编程接口(Application Programming Interface,API)在软件开发以及代码复用中有着重要作用。然而,API代码和文档存在的不一致情况会误导API的使用者并降低软件开发效率及其稳定性等。针对Java API异常代码及其文档描述不一致的情况,提出了一种基于静态分析代码语法树及方法之间的调用关系的自动检测方法,为验证方法的有效性,利用JDK中的API源代码包及其相应文档作为测试对象根据实验结果。本方法的检测结果能达到71.5%的准确率以及85.9%的召回率,能够较为准确地识别API文档对程序异常描述不一致问题,对API文档的编写和维护具有指导性意义。  相似文献   

13.
Improving the efficiency of the testing process is a challenging goal. Prior work has shown that often a small number of errors account for the majority of software failures; and often, most errors are found in a small portion of a source code. We argue that prioritizing code elements before conducting testing can help testers focus their testing effort on the parts of the code most likely to expose errors. This can, in turn, promote more efficient testing of software. Keeping this in view, we propose a testing effort prioritization method to guide tester during software development life cycle. Our approach considers five factors of a component such as Influence value, Average execution time, Structural complexity, Severity and Value as inputs and produce the priority value of the component as an output. Once all components of a program have been prioritized, testing effort can be apportioned so that the components causing more frequent and/or more severe failures will be tested more thoroughly. Our proposed approach is effective in guiding testing effort as it is linked to external measure of defect severity and business value, internal measure of frequency and complexity. As a result, the failure rate is decreased and the chance of severe type of failures is also decreased in the operational environment. We have conducted experiments to compare our scheme with a related scheme. The results establish that our proposed approach that prioritizes the testing effort within the source code is able to minimize highly severed types of failures and also number of failures at the post-release time of a software system.  相似文献   

14.
Free and open source software (FOSS) plays an important role in source code reuse practice. They usually come with one or more software licenses written in the header part of source files, stating the requirements and conditions which should be followed when been reused. Removing or modifying the license statement by re-distributors will result in the inconsistency of license with its ancestor, and may potentially cause license infringement. In this paper, we describe and categorize different types of license inconsistencies and propose a method to detect them. Then we applied this method to Debian 7.5 and a collection of 10,514 Java projects on GitHub and present the license inconsistency cases found in these systems. With a manual analysis, we summarized various reasons behind these license inconsistency cases, some of which imply potential license infringement and require attention from the developers. This analysis also exposes the difficulty to discover license infringements, highlighting the usefulness of finding and maintaining source code provenance.  相似文献   

15.
Segmentation of volumetric data is an important part of many analysis pipelines, but frequently requires manual inspection and correction. While plenty of volume editing techniques exist, it remains cumbersome and errorprone for the user to find and select appropriate regions for editing. We propose an approach to improve volume editing by detecting potential segmentation defects while considering the underlying structure of the object of interest. Our method is based on a novel histogram dissimilarity measure between individual regions, derived from structural information extracted from the initial segmentation. Based on this information, our interactive system guides the user towards potential defects, provides integrated tools for their inspection, and automatically generates suggestions for their resolution. We demonstrate that our approach can reduce interaction effort and supports the user in a comprehensive investigation for high‐quality segmentations.  相似文献   

16.
Software security can be improved by identifying and correcting vulnerabilities. In order to reduce the cost of rework, vulnerabilities should be detected as early and efficiently as possible. Static automated code analysis is an approach for early detection. So far, only few empirical studies have been conducted in an industrial context to evaluate static automated code analysis. A case study was conducted to evaluate static code analysis in industry focusing on defect detection capability, deployment, and usage of static automated code analysis with a focus on software security. We identified that the tool was capable of detecting memory related vulnerabilities, but few vulnerabilities of other types. The deployment of the tool played an important role in its success as an early vulnerability detector, but also the developers perception of the tools merit. Classifying the warnings from the tool was harder for the developers than to correct them. The correction of false positives in some cases created new vulnerabilities in previously safe code. With regard to defect detection ability, we conclude that static code analysis is able to identify vulnerabilities in different categories. In terms of deployment, we conclude that the tool should be integrated with bug reporting systems, and developers need to share the responsibility for classifying and reporting warnings. With regard to tool usage by developers, we propose to use multiple persons (at least two) in classifying a warning. The same goes for making the decision of how to act based on the warning. Copyright © 2012 John Wiley & Sons, Ltd.  相似文献   

17.
面向Java语言的设计模式抽取方法的研究   总被引:1,自引:0,他引:1  
从源码中抽取设计模式对于提高软件可理解性和可维护性、软件设计重用以及软件重构具有重要意义。文章面向Java语言提出了一个从源码中抽取设计模式的方法。具体地,研究了一种特定的设计模式描述方法、定义了源码信息模型及其化简方法,以此为基础提出了设计模式模型和源码模型的匹配方法。特别讨论了在抽取设计模式时与container类相关的问题及其解决方案。最后根据抽取结果从模式及其实例的角度对方法进行了评价,并提出了必要的优化技术。  相似文献   

18.
为了评估软件缺陷的风险,提出了一种基于复杂网络分析的软件缺陷评估方法。该方法首先用一个网络模型表达程序实体之间的关系,将源代码中的方法抽象为节点,方法间的调用关系抽象为有向边,以此构造程序源代码网络;然后分别用介数算法和PageRank算法计算造成软件缺陷的方法节点在源代码全局网络中的地位,由此评估缺陷的风险高低。实验结果表明,该方法在评估内部高危缺陷时有较好的效果,有助于提高软件开发维护人员对一些隐蔽高危缺陷的关注度,进而为后续修复缺陷与软件演化提供有益的线索。  相似文献   

19.
20.
ContextCode generators can automatically perform some tedious and error-prone implementation tasks, increasing productivity and quality in the software development process. Most code generators are based on templates, which are fundamentally composed of text expansion statements. To build templates, the code of an existing, tested and validated implementation may serve as reference, in a process known as templatization. With the dynamics of software evolution/maintenance and the need for performing changes in the code generation templates, there is a loss of synchronism between the templates and this reference code. Additional effort is required to keep them synchronized.ObjectiveThis paper proposes automation as a way to reduce the extra effort needed to keep templates and reference code synchronized.MethodA mechanism was developed to semi-automatically detect and propagate changes from reference code to templates, keeping them synchronized with less effort. The mechanism was also submitted to an empirical evaluation to analyze its effects in terms of effort reduction during maintenance/evolution templatization.ResultsIt was observed that the developed mechanism can lead to a 50% reduction in the effort needed to perform maintenance/evolution templatization, when compared to a manual approach. It was also observed that this effect depends on the nature of the evolution/maintenance task, since for one of the tasks there was no observable advantage in using the mechanism. However, further studies are needed to better characterize these tasks.ConclusionAlthough there is still room for improvement, the results indicate that automation can be used to reduce effort and cost in the maintenance and evolution of a template-based code generation infrastructure.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号