Similar Documents
 20 similar documents found (search time: 62 ms)
1.
陈翔  王秋萍 《计算机科学》2018,45(6):161-165
Defect prediction at the level of code changes has the advantages of a small code-review workload and fast defect localization and repair. This paper is the first to model the problem as a multi-objective optimization problem: one objective is to maximize the number of defect-inducing code changes identified, and the other is to minimize the amount of code that must be reviewed. Since these two objectives conflict, the MULTI method is proposed, which generates a set of prediction models in a non-dominance relationship. The empirical study covers six large open-source projects (227,417 code changes in total), with ACC and POPT as performance metrics. The results show that MULTI significantly outperforms classical supervised methods (EALR and Logistic) and unsupervised methods (LT and AGE).
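The core of the MULTI method described above is keeping only prediction models that no other model dominates on the two objectives (defects found vs. code reviewed). A minimal, library-free sketch of that non-dominated filtering — the candidate model tuples below are invented for illustration, not from the paper:

```python
def non_dominated(models):
    """Keep models that no other model dominates on the two objectives:
    maximize defects found, minimize review effort."""
    def dominates(a, b):
        # a dominates b: no worse on both axes, strictly better on one.
        return (a["defects_found"] >= b["defects_found"]
                and a["review_effort"] <= b["review_effort"]
                and (a["defects_found"] > b["defects_found"]
                     or a["review_effort"] < b["review_effort"]))
    return [m for m in models if not any(dominates(o, m) for o in models)]

# Hypothetical candidate prediction models.
candidates = [
    {"name": "m1", "defects_found": 80, "review_effort": 50},
    {"name": "m2", "defects_found": 60, "review_effort": 20},
    {"name": "m3", "defects_found": 55, "review_effort": 30},  # dominated by m2
    {"name": "m4", "defects_found": 90, "review_effort": 70},
]
front = non_dominated(candidates)
print([m["name"] for m in front])  # ['m1', 'm2', 'm4']
```

The surviving set is the Pareto front MULTI would present to a reviewer, who then picks a model matching the available review budget.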

2.
Today's massively parallel machines are typically message-passing systems consisting of hundreds or thousands of processors. Implementing parallel applications efficiently in this environment is a challenging task, and poor parallel design decisions can be expensive to correct. Tools and techniques that allow the fast and accurate evaluation of different parallelization strategies would significantly improve the productivity of application developers and increase throughput on parallel architectures. This paper investigates one of the major issues in building tools to compare parallelization strategies: determining what type of performance models of the application code and of the computer system are sufficient for a fast and accurate comparison of different strategies. The paper is built around a case study employing the performance prediction tool (PerPreT) to predict performance of the parallel spectral transform shallow water model code (PSTSWM) on the Intel Paragon. PSTSWM is a parallel application code that was designed to evaluate different parallel strategies for the spectral transform method as it is used in climate modeling and weather forecasting. Multiple parallel algorithms and algorithm variants are embedded in the code. PerPreT uses a relatively simple algebraic model to predict execution time for SPMD (single program multiple data) parallel applications. Applications are modeled through parameterized formulae for communication and computation, where the parameters include the problem size, the number of processors used to execute the program, and system characteristics (e.g. setup times for communication, link bandwidth and sustained computing performance per processor). In this paper we describe performance models that predict the performance of the different algorithms in PSTSWM accurately enough to allow them to be compared, establishing the feasibility of such a demanding application of performance modeling. 
We also discuss issues in generating and validating the performance models, emphasizing the practical importance of tools such as PerPreT in such studies. © 1998 John Wiley & Sons, Ltd.
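PerPreT's style of model — execution time expressed as parameterized computation plus communication terms over problem size, processor count, and system characteristics — can be illustrated with a toy SPMD formula. All constants below are invented placeholders, not measured Paragon parameters:

```python
def predicted_time(n, p, flops_per_point=100.0, peak_flops=5e7,
                   setup_s=1e-4, bandwidth_bps=50e6, bytes_per_point=8):
    """Toy algebraic SPMD model: T(n, p) = T_comp + T_comm.

    n: problem size (grid points); p: number of processors.
    Computation scales as n/p; communication is a per-step setup cost
    plus boundary data volume divided by link bandwidth.
    """
    t_comp = (n / p) * flops_per_point / peak_flops
    t_comm = setup_s + (n / p) * bytes_per_point / bandwidth_bps
    return t_comp + t_comm

# Compare hypothetical processor counts for one problem size.
for p in (16, 64, 256):
    print(p, round(predicted_time(1_000_000, p), 4))
```

Comparing such closed-form predictions across parallelization strategies is exactly the fast, cheap evaluation the paper argues such tools enable.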

3.
原子  于莉莉  刘超 《软件学报》2014,25(11):2499-2517
Software changes continuously over its life cycle to adapt to evolving requirements and environments. To predict promptly whether each change introduces a defect, researchers have proposed change-level defect prediction. Existing methods have three shortcomings: (1) they operate only at coarse granularity (transaction-level and source-file-level changes); (2) they represent changes only with the vector space model, without fully mining the program structure, natural-language semantics, and historical information buried in software repositories; (3) they consider only short-term prediction, ignoring the concept drift caused during long-term software evolution by external factors such as new requirements or team reorganization. To address these shortcomings, this paper proposes a change-level defect prediction method. It takes fine-grained (statement-level) changes as the prediction unit, effectively reducing quality-assurance cost; it mines software repositories deeply by combining static program analysis with natural-language topic inference, building a feature set from four aspects of a change — context, content, time, and people — to reveal the factors that make changes defect-prone; and it analyzes the characteristics of concept drift during software evolution with a feature-entropy difference matrix, achieving stable long-term prediction through a dynamic-window learning mechanism with concept recall. The method's effectiveness is validated on six well-known open-source projects.

4.
As the engineering vehicle of artificial intelligence, intelligent computing frameworks have been widely adopted in recent years, and their reliability is critical to the effective realization of AI. Ensuring that reliability is challenging: on one hand, framework code iterates rapidly and is hard to test; on the other, unlike traditional software, these frameworks involve large amounts of tensor computation, and their coding conventions lack guidance from software-engineering theory. Existing work mainly applies fuzz testing for defect localization, but such methods can pinpoint only specific kinds of defects and cannot guide developers' attention to software quality during development itself. This work therefore takes common intelligent computing frameworks (TensorFlow, Baidu PaddlePaddle, etc.) as subjects, selects a variety of change features to build datasets, and performs just-in-time defect prediction at the commit level. On top of this, LDA topic modeling is used to mine code and commit messages as new features, and a random forest is used for prediction. The average AUC-ROC is 0.77, and the semantic information slightly improves prediction performance. Finally, the interpretable machine-learning method SHAP is used to analyze how each feature affects the model's output, finding that: (1) the basic features influence the model in line with traditional software-development patterns; (2) the semantic features in code and commit messages have an important impact on predictions; (3) the contributions of different features to the model output rank…
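The AUC-ROC reported above (0.77 on average) can be computed without any ML library via the rank (Mann-Whitney) formulation: the probability that a randomly chosen defective commit is scored above a clean one. A stdlib sketch with made-up commit scores:

```python
def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs the scorer
    orders correctly, counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical defect scores for 6 commits (1 = defect-inducing).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.15, 0.7, 0.2, 0.1]
print(auc_roc(labels, scores))  # 7 of 9 pairs ordered correctly ≈ 0.778
```

A value of 0.5 corresponds to random guessing, so 0.77 indicates the commit-level model ranks defective changes well above clean ones most of the time.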

5.
汪昕  陈驰  赵逸凡  彭鑫  赵文耘 《软件学报》2019,30(5):1342-1358
开发人员经常需要使用各种应用程序编程接口(application programming interface,简称API)来复用已有的软件框架、类库等.由于API自身的复杂性、文档资料的缺失等原因,开发人员经常会误用API,从而导致代码缺陷.为了自动检测API误用缺陷,需要获得API使用规约,并根据规约对API使用代码进行检测.然而,可用于自动检测的API规约难以获得,而人工编写并维护的代价又很高.针对以上问题,将深度学习中的循环神经网络模型应用于API使用规约的学习及API误用缺陷的检测.在大量的开源Java代码基础上,通过静态分析构造API使用规约训练样本,同时利用这些训练样本搭建循环神经网络学习API使用规约.在此基础上,针对API使用代码进行基于上下文的语句预测,并通过预测结果与实际代码的比较发现潜在的API误用缺陷.对所提出的方法进行实现并针对Java加密相关的API及其使用代码进行了实验评估,结果表明,该方法能够在一定程度上实现API误用缺陷的自动发现.  相似文献   

6.
This paper presents an application of fuzzy-logic techniques to the reversible compression of grayscale images. With reference to a spatial differential pulse code modulation (DPCM) scheme, prediction may be accomplished in a space-varying fashion either as adaptive, i.e., with predictors recalculated at each pixel, or as classified, in which image blocks or pixels are labeled in a number of classes for which fitting predictors are calculated. Here, an original tradeoff is proposed: a space-varying linear-regression prediction is obtained through fuzzy-logic techniques as a problem of matching pursuit, in which a different predictor for every pixel is obtained as an expansion in series of a finite number of prototype nonorthogonal predictors that are themselves calculated in a fuzzy fashion. To enhance entropy coding, the spatial prediction is followed by context-based statistical modeling of prediction errors. A thorough comparison with the most advanced methods in the literature, as well as an investigation of performance trends and computing times with respect to the working parameters, highlights the advantages of the proposed fuzzy approach to data compression.
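The tradeoff described above — a per-pixel predictor expanded as a fuzzy-weighted sum of a few prototype linear predictors over the causal neighbors — can be sketched as follows. The prototype coefficients and membership degrees are invented, not the paper's trained values:

```python
def fuzzy_predict(neighbors, prototypes, memberships):
    """Predict a pixel from its causal neighbors (e.g. W, N, NW) as a
    membership-weighted combination of prototype linear predictors."""
    total = sum(memberships)
    pred = 0.0
    for w, proto in zip(memberships, prototypes):
        pred += (w / total) * sum(c * x for c, x in zip(proto, neighbors))
    return pred

# Two hypothetical prototypes: a horizontal and a vertical predictor.
prototypes = [
    [1.0, 0.0, 0.0],   # copy the west neighbor
    [0.0, 1.0, 0.0],   # copy the north neighbor
]
neighbors = [100.0, 120.0, 110.0]   # W, N, NW intensities
memberships = [0.75, 0.25]          # fuzzy degrees for this pixel
print(fuzzy_predict(neighbors, prototypes, memberships))  # 0.75*100 + 0.25*120 = 105.0
```

In lossless DPCM the encoder then entropy-codes the residual (actual pixel minus this prediction), so a better-fitting per-pixel predictor directly shrinks the coded data.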

7.
Performance prediction is an important engineering tool that provides valuable feedback on design choices in program synthesis and machine architecture development. We present an analytic performance modeling approach aimed at minimizing prediction cost while providing prediction accuracy sufficient to enable major code and data mapping decisions. Our approach is based on a performance simulation language called PAMELA. Apart from simulation, PAMELA features a symbolic analysis technique that enables PAMELA models to be compiled into symbolic performance models that trade prediction accuracy for the lowest possible solution cost. We demonstrate our approach through a large number of theoretical and practical modeling case studies, including six parallel programs and two distributed-memory machines. The average prediction error of our approach is less than 10 percent, while the average worst-case error is limited to 50 percent. It is shown that this accuracy is sufficient to correctly select the best coding or partitioning strategy. For programs expressed in a high-level, structured programming model, such as data-parallel programs, symbolic performance modeling can be entirely automated. We report on experiments with a PAMELA model generator built within a data-parallel compiler for distributed-memory machines. Our results show that with negligible program annotation, symbolic performance models are automatically compiled in seconds, while their solution cost is on the order of milliseconds.

8.
To overcome the poor portability of single-DSP code-excited linear prediction (CELP) speech systems, as well as the high design cost, complex hardware interfaces, and low stability of dual-processor (ARM plus DSP) CELP designs, this paper proposes implementing a CELP speech system on a single S3C2410 processor chip, covering algorithm analysis, hardware platform design, and system software design. Experimental results show that the single-chip S3C2410 design improves the system's portability and stability and reduces design complexity and cost, without degrading speech performance.

9.
Value prediction, a technique for breaking data dependencies, is important in enhancing instruction-level parallelism and processor performance. This paper proposes a new value predictor that exploits both the loop and locality properties of data values to pursue desirable prediction accuracy at reasonable cost. The proposed predictor, called the Dynamic Loop and Locality-based (DLL) predictor, makes predictions by dynamically applying either the loop-based or the locality-based prediction policy according to its state. With a few simple design choices, the DLL predictor gains prediction accuracy efficiently. To enable a more comprehensive experimental evaluation of value predictors, a new performance measure, accuracy improvement per cost (the A/C ratio), is introduced in the paper. Simulation results show that, compared with other existing value predictors, the proposed DLL predictor produces better A/C ratios in almost all situations, owing to its flexible application of different prediction policies and reduced cost.
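The DLL predictor's central idea — switch per instruction between a loop (stride) policy and a locality (last-value) policy according to which has been more accurate — can be sketched as a toy simulation. The state layout and scoring below are invented for illustration and are not the paper's actual hardware design:

```python
class ToyDLLPredictor:
    """Per-instruction entry that chooses between stride and last-value
    prediction based on which policy has scored more hits so far."""
    def __init__(self):
        self.last = 0
        self.stride = 0
        self.score = {"stride": 0, "last": 0}

    def predict(self):
        policy = max(self.score, key=self.score.get)
        return self.last + self.stride if policy == "stride" else self.last

    def update(self, actual):
        # Reward whichever policy would have predicted correctly.
        if self.last + self.stride == actual:
            self.score["stride"] += 1
        if self.last == actual:
            self.score["last"] += 1
        self.stride = actual - self.last
        self.last = actual

p = ToyDLLPredictor()
values = list(range(0, 40, 4))   # loop-like sequence with stride 4
hits = 0
for v in values:
    hits += p.predict() == v
    p.update(v)
print(hits, "of", len(values), "predicted correctly")
```

On a stride sequence the stride policy quickly accumulates score and takes over; on a sequence of repeated values the last-value policy would win instead, which is the adaptivity the DLL design targets.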

10.
王欢  张丽萍  闫盛 《计算机应用》2016,36(12):3468-3475
To address the imbalance between harmful and harmless samples in clone-code harmfulness prediction, this paper proposes K-Balance, a random under-sampling (RUS) based algorithm that can automatically adjust the class-imbalance ratio. First, static and evolution features are extracted from clone code to build a sample dataset; next, new datasets with different imbalance ratios are drawn; harmfulness prediction is then run on each of the selected datasets; and finally the most suitable imbalance ratio is chosen automatically by observing the classifier's varying performance. The harmfulness prediction model is evaluated on 170 versions of seven open-source C projects and compared with other class-imbalance remedies. Experimental results show that the proposed method improves prediction of harmful and harmless clones, measured by the area under the receiver operating characteristic curve (AUC), by 2.62 to 36.70 percentage points, effectively mitigating the class-imbalance problem.
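K-Balance's outer loop — draw datasets at several under-sampling ratios, evaluate each, and keep the ratio that scores best — can be sketched with stdlib code. The data, candidate ratios, and the trivial stand-in scorer below are invented; the paper evaluates real clone features with real classifiers and AUC:

```python
import random

def undersample(majority, minority, ratio, rng):
    """Randomly drop majority samples until majority:minority == ratio."""
    keep = min(len(majority), int(len(minority) * ratio))
    return rng.sample(majority, keep) + minority

def evaluate(balanced):
    """Stand-in for training a classifier and measuring AUC: here just
    the minority share, enough to demonstrate the selection loop."""
    harmful = sum(1 for label, _ in balanced if label == 1)
    return harmful / len(balanced)

rng = random.Random(42)
harmless = [(0, i) for i in range(900)]   # majority class
harmful = [(1, i) for i in range(100)]    # minority class

best = max(
    (1.0, 2.0, 4.0),   # candidate majority:minority ratios
    key=lambda r: evaluate(undersample(harmless, harmful, r, rng)),
)
print("selected majority:minority ratio:", best)
```

Replacing `evaluate` with an actual train-and-score step gives the automatic ratio selection the abstract describes.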

11.
Modern Code Review (MCR) has been widely adopted by open source and proprietary software projects. Inspecting code changes consumes much of reviewers' time and effort, since they must comprehend patches and a single reviewer is often assigned many code changes. Moreover, a code change may eventually be abandoned, wasting that time and effort. A tool that predicts early on whether a code change will be merged can therefore help developers prioritize changes to inspect, accomplish more under tight schedules, and avoid spending review effort on low-quality changes. Motivated by these needs, we build a merged code change prediction tool. Our approach first extracts 34 features from code changes, grouped into five dimensions: code, file history, owner experience, collaboration network, and text. We then leverage machine learning techniques such as random forest to build a prediction model. To evaluate the performance of our approach, we conduct experiments on three open source projects (Eclipse, LibreOffice, and OpenStack) containing a total of 166,215 code changes. Across the three datasets, our approach statistically significantly outperforms random-guess classifiers and the two prediction models proposed by Jeong et al. (2009) and Gousios et al. (2014) on several evaluation metrics. We also study the important features that distinguish merged code changes from abandoned ones.

12.
张献  贲可荣  曾杰 《软件学报》2021,32(7):2219-2241
Software defect prediction is an active topic in software quality assurance: it helps developers find potential defects and make better use of resources. How to design more discriminative metrics for prediction systems, while balancing performance against interpretability, has long been a research goal. To meet this challenge, the paper proposes CNDePor, a defect prediction method based on code-naturalness features. The method improves … through bidirectional (forward and reverse) code measurement and quality-information-based sample weighting…

13.
This paper proposes an enhanced method of multiple branch prediction using a per-primary-branch history table. The scheme improves on previous ones based on a single global branch history register by reducing interference among the histories of different branches that would otherwise share a single register. It also keeps the prediction of one branch from affecting the predictions of other branches made in the same cycle, allowing independent and parallel prediction of multiple branches. Our experimental results indicate that these features achieve higher prediction accuracy than the previous global-history scheme (which is already high) at lower hardware cost (96.1% vs. 95.1% for integer code and 95.7% vs. 94.9% for floating-point code including nasa7, for a given hardware budget of 128K bits). Moreover, the increased prediction accuracy yields better fetch bandwidth on a superscalar machine (7.1 vs. 6.9 instructions per clock cycle for integer code and 11.0 vs. 10.9 instructions per cycle for floating-point code).
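The scheme's key change — indexing history per primary branch rather than through one shared global register — can be sketched as a toy two-bit-counter simulation. The table organization, history width, and branch trace below are invented, not the paper's hardware parameters:

```python
from collections import defaultdict

class PerBranchPredictor:
    """Toy per-branch scheme: each static branch (keyed by PC) has its
    own history register indexing its own 2-bit saturating counters,
    so branches cannot interfere through a shared global register."""
    def __init__(self, hist_bits=4):
        self.hist = defaultdict(int)         # PC -> local history bits
        self.ctr = defaultdict(lambda: 2)    # (PC, history) -> counter
        self.mask = (1 << hist_bits) - 1

    def predict(self, pc):
        return self.ctr[(pc, self.hist[pc])] >= 2   # counter >= 2: taken

    def update(self, pc, taken):
        key = (pc, self.hist[pc])
        self.ctr[key] = min(3, self.ctr[key] + 1) if taken else max(0, self.ctr[key] - 1)
        self.hist[pc] = ((self.hist[pc] << 1) | int(taken)) & self.mask

bp = PerBranchPredictor()
# Hypothetical trace: branch A alternates; branch B is always taken.
trace = [("A", i % 2 == 1) for i in range(200)] + [("B", True)] * 200
hits = 0
for pc, taken in trace:
    hits += bp.predict(pc) == taken
    bp.update(pc, taken)
print(hits, "of", len(trace), "correct")
```

With a shared global register, B's outcomes would pollute the history used to predict A; here A's alternating pattern and B's constant behavior are learned independently, which is the interference reduction the abstract describes.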

14.
15.
The development of artificial intelligence (AI) techniques provides strong support for AI systems in source-code processing scenarios. Compared with natural language, source code occupies a distinctive semantic space; machine-learning tasks on source code typically obtain the code's structural information and extract features through abstract syntax trees, data dependence graphs, control flow graphs, and the like. Through deep analysis of source-code structure and flexible use of classifiers, existing research already achieves excellent results in experimental settings. In real application scenarios with more complex code structure, however, most source-code-processing AI systems suffer performance degradation and are hard to deploy in industry, which has prompted practitioners to reflect on the robustness of AI systems. Because systems built on AI techniques are generally data-driven black boxes, measuring their robustness directly is difficult. With the rise of adversarial attack techniques, researchers in natural language processing have already designed task-specific adversarial attacks to verify model robustness and conducted large-scale empirical studies. To address the instability of source-code-processing AI systems on complex code, this paper proposes a robustness verification method (robustness verification by Metropolis-Hastings attack method, RVMHM). It first uses an abstract-syntax-tree-based code preprocessing tool to extract the model's variable pool, and then…

16.
In the deep learning frameworks used to build deep learning models, correct operator computation is essential to a model's correct predictions. However, existing framework defect-detection methods can only find, by comparison and inference, operators whose computed results differ widely across different frameworks; they cannot detect computation errors that arise during model training, which is a serious limitation. To address this, this paper designs and implements a meta-operator-based deep learning framework…

17.
To improve Web development efficiency, developers often reuse existing system frameworks or code from mature projects, which leaves Web applications with large amounts of redundant code. Redundant code not only hurts a program's readability and runtime efficiency but can also hide software defects. By studying the source-code logic and framework characteristics of Web applications, this paper proposes a redundant-code detection method for Web application systems based on source-code analysis. Starting from the application entry point, it builds the Web application's call tree from the logical call relations among code units, obtaining the sets of reachable pages and of reachable classes and methods; a redundancy-detection algorithm then reports the redundant pages, handler classes, and handler methods in the system. To evaluate the method's effectiveness, including its miss rate and false-positive rate, redundancy detection is performed on two Java Web applications and validated through experiments with manually injected redundancy. The results show that the proposed redundant-code detection method achieves high detection efficiency.
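The detection core described above — build the call tree from the entry point and mark everything never reached as redundant — reduces to graph reachability. A minimal sketch over an invented call graph (the method and page names are hypothetical):

```python
from collections import deque

def find_redundant(call_graph, entry):
    """Return methods unreachable from the entry point.

    call_graph: dict mapping each method to the methods it calls.
    """
    reached = {entry}
    queue = deque([entry])
    while queue:
        for callee in call_graph.get(queue.popleft(), []):
            if callee not in reached:
                reached.add(callee)
                queue.append(callee)
    return sorted(set(call_graph) - reached)

# Hypothetical Web app: index -> login/search; oldReport is never called.
graph = {
    "index": ["login", "search"],
    "login": ["checkUser"],
    "search": ["queryDb"],
    "checkUser": [],
    "queryDb": [],
    "oldReport": ["queryDb"],   # legacy handler, unreachable
}
print(find_redundant(graph, "index"))  # ['oldReport']
```

Note that `oldReport` is flagged even though it calls a reachable method: redundancy is decided by whether anything reaches it from the entry point, not by what it calls.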

18.
This paper describes an approach for combining the classifications or predictions of n local experts into a single composite prediction. We describe a Java-based application that allows a user to select up to n prediction experts that provide information for assigning an object to one of two predetermined groups. An advantage of this type of application is that it is capable of interacting with the Internet in a relatively seamless way. We examine the accuracy and robustness of our technique by comparing its classification accuracy against a maximum entropy-based aggregation technique and four classification methods on a real-world, two-group data set concerned with bank-failure prediction. The classification methods studied in this work include Quinlan's C4.5 decision-tree classifier, logistic regression, Mahalanobis distance measures, and a neural network classifier. Our model includes a fundamental component (a transaction manager) that helps improve the general performance of applications that perform network-based classification; this component provides reliable and secure connections along with ways to direct traffic across the Internet. Our results suggest three major contributions: (1) a transaction manager increases the flexibility of a network-based classifier, since it is capable of transacting with one or more specific types of prediction experts over the Internet; (2) our approach tends to be more accurate than the individual classification methods we examined; and (3) our approach can outperform a recently introduced statistically based aggregation technique.
Scope and purpose: The emergence of the Internet has produced a need for new types of programming and research tools capable of accessing information resources located throughout the world. Only a limited amount of research is available in this area, and this work describes a network-based tool that solves a two-group classification problem.
The two-group classification problem in discriminant analysis is concerned with developing a rule for predicting to which of k=2 mutually exclusive groups an observation of unknown origin belongs. This problem commonly occurs in business and other areas, and a plethora of statistical and artificial intelligence (AI) techniques exist to help decision-makers analyze their data effectively. A number of recent studies have compared the classificatory performance of various AI techniques with that of more traditional statistical techniques; decision makers are nonetheless left in somewhat of a quandary about which of the many available classification techniques to use for a specific classification problem. This paper proposes a new aggregation technique that combines the predictions from multiple classification techniques into a single composite prediction. Our approach provides a simple method for aggregating expert predictions coming from remote locations by combining Java and the Common Object Request Broker Architecture (CORBA) into a general classification tool. Object-oriented models developed in Java are platform independent and easily modified, while CORBA provides the services necessary to establish and manage network connections. Computational results show that our technique outperforms a recently introduced maximum entropy-based aggregation technique on a real-world data set.
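The aggregation step — combining n experts' group-membership estimates into one composite two-group prediction — can be sketched as a weighted average. The expert scores and weights below are invented, and this simple rule stands in for the paper's aggregation scheme and the entropy-based baseline, both of which are more involved:

```python
def aggregate(expert_scores, weights=None):
    """Combine experts' P(group 1) estimates by weighted average and
    assign the object to group 1 iff the composite exceeds 0.5."""
    if weights is None:
        weights = [1.0] * len(expert_scores)
    composite = sum(w * s for w, s in zip(weights, expert_scores)) / sum(weights)
    return (1 if composite > 0.5 else 0), composite

# Hypothetical P(bank fails) from four experts: a decision tree,
# logistic regression, Mahalanobis distance, and a neural network.
scores = [0.8, 0.65, 0.4, 0.7]
group, composite = aggregate(scores, weights=[2.0, 1.0, 1.0, 1.0])
print(group, round(composite, 3))  # group 1, composite 0.67
```

In the networked setting of the paper, each score would arrive from a remote expert via the transaction manager before being combined locally.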

19.
Dale Parson  Zhenyu Zhu 《Software》2000,30(15):1641-1660
The Java™ Native Interface (JNI) provides a set of mechanisms for implementing Java methods in C or C++. JNI is useful for reusing C and C++ code repositories within Java frameworks. JNI is also useful for real-time systems, where compiled C/C++ code executes performance-critical tasks while Java code executes system control and feature tasks. Available JNI literature concentrates on creating Java proxy classes that allow Java clients to interact with C++ classes. Current JNI literature does not discuss Java proxies for entire C++ inheritance hierarchies; that is the topic of this paper. Our experience in reusing C++ class hierarchies within a Java framework has uncovered a set of useful techniques for constructing Java proxy class hierarchies that mirror their C++ counterparts. This report gives both high-level design guidelines and specific programming idioms for constructing Java class hierarchies that serve as proxies for C++ counterparts. We begin by discussing opportunities for reuse within a proxy class hierarchy, as well as problems caused by differences between the Java and C++ approaches to inheritance. The two most significant differences are due to C++ support for invocation of a member function based on the static type of its class, and C++ support for multiple implementation inheritance. Two example C++ class hierarchies provide the basis for a set of sections that present the design guidelines and codify the programming idioms. This work could serve as the basis for an automatic generator of Java proxy class hierarchies. Copyright © 2000 John Wiley & Sons, Ltd.

20.
Because they are interpreted, Java executables run slower than their compiled counterparts. The native executable translation (NET) compiler's objective is to optimize the translation of Java bytecode to native machine code so that it runs nearly as fast as native code generated directly from source. The article presents preliminary results for several large application programs and standard benchmarks, comparing NET-compiled code performance with Sun's Java VM, Microsoft's Java just-in-time compiler, and equivalent C and C++ programs compiled directly. The results show that the optimizing NET compiler achieves better performance than the two other bytecode execution methods, in some cases reaching speeds comparable to directly compiled native code.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号