首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Program development often proceeds by transforming simple, clear programs into complex, involuted, but more efficient ones. This paper examines ways this process can be rendered more systematic. We show how analysis of program performance, partial evaluation of functions, and abstraction of recursive function definitions from recurring subgoals can be combined to yield many global transformations in a methodical fashion. Examples are drawn from compiler optimization, list processing, very high-evel languages, and APL execution.  相似文献   

2.
The application of a technique for labelling connected components based on the classical recursive technique is studied. The recursive approach permits labelling, counting, and characterizing objects with a single pass. Its main drawback lies on its very nature: Big objects require a high number of recursive calls, which require a large stack to store local variables and register values. Thus, the risk of stack overflow imposes an impractical limit on image size. The hybrid alternative combines recursion with iterative scanning and can be directly substituted into any program already using the recursive technique. I show how this alternative drastically reduces the number of consecutive recursive calls, and thus the required stack size, while improving overall performance. The method is tested on sets of uniform random binary images and binary images with a random distribution of overlapping square blocks. These test sets provide insight on the adequacy of the algorithm for different applications. The performance of the proposed technique is compared with the classical recursive technique and with an iterative two-pass algorithm using the Union-Find data structure, and the results show an overall increase of speed. The performance of the algorithm in real world machine vision applications is also shown.  相似文献   

3.
Many problems in formal verification and program analysis can be formalized as computing winning strategies for two-player games on graphs. In this paper, we focus on solving games in recursive game graphs which can model the control flow in sequential programs with recursive procedure calls. While such games can be viewed as the pushdown games studied in the literature, the natural notion of winning in our framework requires the strategies to be modular with only local memory; that is, resolution of choices within a module does not depend on the context in which the module is invoked, but only on the history within the current invocation of the module. While reachability in (global) pushdown games is known to be EXPTIME-complete, we show reachability in modular games to be NP-complete. We present a fixed-point computation algorithm for solving modular games such that in the worst case the number of iterations is exponential in the total number of returned values from the modules. If the strategy within a module does not depend on the global history, but can remember the history of the past invocations of this module, that is, if memory is local but persistent, we show that reachability becomes undecidable.  相似文献   

4.
David H. Hartley 《Software》1992,22(2):101-110
Random access to local variables stored in a stack frame is more difficult for compiled functions when the target processor lacks register-plus-offset addressing. One alternative technique employs a roving pointer which the program increments or decrements as needed between stack accesses. Processors which support auto-increment and auto-decrement addressing modes are often capable of performing these increments for free when consecutive accesses are to adjacent stack locations. For these processors, the compiler's chosen ordering for the local variables in the stack frame can substantially affect the execution speed of the compiled program. This paper is concerned with finding an ordering for the local variables in the frame that maximizes the likelihood that two consecutive references at run-time will be to the same or to adjacent stack locations. We have formulated a solution to this problem in terms of finding a Hamiltonian path in a graph. Although this problem is NP-complete in general, we have developed a heuristic algorithm that delivers good results with acceptable performance.  相似文献   

5.
函数内联(Function Inlining)是使用函数体代替函数调用的一种编译优化技术。LLVM中原生的内联模型只根据函数体的大小来判断函数内联与否,而没有考虑函数的调用次数和后续的优化。针对这个问题,提出了基于函数调用次数(NFC)和考虑后续循环合并(BLF)的新内联模型。首先,通过NFC模型对被多次调用的函数进行内联,进而减少更多因函数调用而产生的额外消耗。其次,通过BLF模型能够识别出进行内联后可以进一步进行循环合并优化的函数,为后续循环合并优化提供支持。实验结果表明,提出的函数内联优化技术是可行的,测试程序平均加速比为1.52%。  相似文献   

6.
分析实际程序时往往需要分析程序中函数的调用, 一般使用过程间分析来实现全程序分析.函数内联是一种最为精确、易于实现的过程间分析方法.通过函数内联, 可以使得已有过程内分析方法和工具支持包含函数调用的程序的分析.但是, 函数内联后代码的规模急剧增加, 同时将产生大量中间变量, 增加程序分析的变量维度, 导致程序分析过程时空开销大大增加.本文考虑基于抽象解释框架下函数内联过程间分析的一些不足, 并提出相应优化方法.基于抽象解释的程序分析关注自动推导程序变量之间的不变式约束关系, 因此程序变量构成的程序环境大小(即各程序点处须考虑的相关变量集合)对分析的时空开销具有重要影响.为了减少函数内联后程序分析的开销, 本文提出了面向内联函数块的程序环境降维优化方法.该方法针对内联函数后的程序代码, 分析确定不同程序点处需维护的程序环境(即相关变量集合), 而不是所有程序点共享同一全局程序环境, 从而实现程序状态的降维.详细描述了基于该方法所实现的工具DRIP (Dimension Reduction for analyzing function Inlined Program) 的架构、模块及算法细节.并在WCET Benchmarks测试集开展了分析实验, 实验结果表明: DRIP在变量消除上取得的效果良好, 甚至在某些测试集上能减少一半以上的变量, 并在一定程度上降低了分析过程的时空开销.  相似文献   

7.
We present the design and implementation of Arachne, a threads system that can be interfaced with a communications library for multithreaded distributed computations. In particular, Arachne supports thread migration between heterogeneous platforms, dynamic stack size management, and recursive thread functions. Arachne is efficient, flexible, and portable-it is based entirely on C and C++. To facilitate heterogeneous thread operations, we have added three keywords to the C++ language. The Arachne preprocessor takes as input code written in that language and outputs C++ code suitable for compilation with a conventional C++ compiler. The Arachne runtime system manages all threads during program execution. We present some performance measurements on the costs of basic thread operations and thread migration in Arachne and compare these to costs in other threads systems  相似文献   

8.
Tuning compiler optimizations for rapidly evolving hardware makes porting and extending an optimizing compiler for each new platform extremely challenging. Iterative optimization is a popular approach to adapting programs to a new architecture automatically using feedback-directed compilation. However, the large number of evaluations required for each program has prevented iterative compilation from widespread take-up in production compilers. Machine learning has been proposed to tune optimizations across programs systematically but is currently limited to a few transformations, long training phases and critically lacks publicly released, stable tools. Our approach is to develop a modular, extensible, self-tuning optimization infrastructure to automatically learn the best optimizations across multiple programs and architectures based on the correlation between program features, run-time behavior and optimizations. In this paper we describe Milepost GCC, the first publicly-available open-source machine learning-based compiler. It consists of an Interactive Compilation Interface (ICI) and plugins to extract program features and exchange optimization data with the cTuning.org open public repository. It automatically adapts the internal optimization heuristic at function-level granularity to improve execution time, code size and compilation time of a new program on a given architecture. Part of the MILEPOST technology together with low-level ICI-inspired plugin framework is now included in the mainline GCC. We developed machine learning plugins based on probabilistic and transductive approaches to predict good combinations of optimizations. Our preliminary experimental results show that it is possible to automatically reduce the execution time of individual MiBench programs, some by more than a factor of 2, while also improving compilation time and code size. On average we are able to reduce the execution time of the MiBench benchmark suite by 11% for the ARC reconfigurable processor. We also present a realistic multi-objective optimization scenario for Berkeley DB library using Milepost GCC and improve execution time by approximately 17%, while reducing compilation time and code size by 12% and 7% respectively on Intel Xeon processor.  相似文献   

9.
We introduce a new algebraic model for program variables, suitable for reasoning about recursive procedures with parameters and local variables in a mechanical verification setting. We give a predicate transformer semantics to recursive procedures and prove refinement rules for introducing recursive procedure calls, procedure parameters, and local variables. We also prove, based on the refinement rules, Hoare total correctness rules for recursive procedures, and parameters. We introduce a special form of Hoare specification statement which alone is enough to fully specify a procedure. Moreover, we prove that this Hoare specification statement is equivalent to a refinement specification. We implemented this theory in the PVS theorem prover.This work is based on an earlier work: Reasoning about recursive procedures with parameters. In Proceedings of the Workshop on Mechanized Reasoning about Languages with Variable Binding, 2003, Uppsala, Sweden, ACM Press.Received March 2004Revised October 2004Accepted February 2005 by C. B. Jones  相似文献   

10.
The Arachne aspect-oriented programming system developers modularize changes to networking software with little perceptible performance overhead. Writing good software is often a challenge; writing adaptable software can be even more difficult. In legacy Web caches such as Squid, such adaptation interfaces are typically needed for functionalities that are applied across the legacy code, functionalities whose code is scattered and tangled in the Web cache code's files and functions. An AOP system's join point model defines the relevant basic execution events of a given base application. Pointcuts allow programmers to refer to all such events, at which a functionality of interest is applied across, or crosscuts, the base application. At the execution events matched by pointcuts, advice can be used to modify the base application's execution. Arachne features two basic kinds of join points: C function calls and read/write accesses to global variables and their local aliases.  相似文献   

11.
We study transformations and equivalences of recursive program schemes. We give an optimization algorithm which recognizes and removes all parts of a program scheme which do not affect its final output. This result leads to a syntactic way of suppressing some erroneous loops in programs and can be used to prove that equivalence of recursive program schemes is solvable under particular conditions.  相似文献   

12.
In a dedicated, mixed-machine, heterogeneous computing (HC) system, an application program may be decomposed into subtasks, then each subtask assigned to the machine where it is best suited for execution. Data relocation is defined as selecting the sources for needed data items. It is assumed that multiple independent subtasks of an application program can be executed concurrently on different machines whenever possible. A theoretical stochastic model for HC Is proposed, in which the computation times of subtasks and communication times for intermachine data transfers can be random variables. The optimization problem for finding the optimal matching, scheduling, and data relocation schemes to minimize the total execution time of an application program is defined based on this stochastic HC model. The global optimization criterion and search space for the above optimization problem are described. It is validated that a greedy algorithm-based approach can establish a local optimization criterion for developing data relocation heuristics. The validation is provided by a theoretical proof based on a set of common assumptions about the underlying HC system and application program. The local optimization criterion established by the greedy approach, coupled with the search space defined for choosing valid data relocation schemes, can help developers of future practical data relocation heuristics  相似文献   

13.
Asynchronous programming is a paradigm that supports asynchronous function calls in addition to synchronous function calls. Programs in such a setting can be modeled by automata with counters that keep track of the number of pending asynchronous calls for each function, as well as a call stack for synchronous recursive computation. These programs have the restriction that an asynchronous call is processed only when the call stack is empty. The decidability of the control state reachability problem for such systems was recently established. In this paper, we consider the problems of checking other branching time properties for such systems. Specifically we consider the following problems — termination, which asks if there is an infinite (non-terminating) computation exhibited by the system; control state maintainability, which asks if there is a maximal execution of the system, where all the state visited lie in some “good” set; whether the system can be simulated by a given finite state system; and whether the system can simulate a given finite state system. We present decision algorithms for all these problems.  相似文献   

14.
一种结合自适应局部搜索的粒子群优化算法   总被引:1,自引:1,他引:0  
肖丽  张伟  张元清 《计算机科学》2007,34(8):199-201
本文提出一种结合自适应局部搜索的混合粒子群优化算法.该方法在粒子群优化算法的全局搜索过程中,使用能根据当前种群搜索状态自适应地调整局部搜索空间大小的局部搜索算法加强其局部搜索能力.采用了著名的基准函数对算法的性能进行测试,并与其他已有算法进行了比较.结果表明,这种混合粒子群优化算法能获得更高的搜索成功率和质量更好的解,特别在高维复杂函数优化上具有很强的竞争力.  相似文献   

15.
The execution of logic queries in a distributed database environment is studied. Conventional optimization strategies, such as the early evaluation of selection conditions and the clustering of processing to manipulate and exchange large sets of tuples, are redefined in view of the additional difficulties due to logic queries, in particular to recursive rules. In order to allow efficient processing of these logic queries, several program transformation techniques that attempt to minimize distribution costs based on the idea of semijoins and generalized semijoins in conventional databases are presented. Although local computation of semijoins is not possible for the general case, classes of programs are indicated for which these transformations succeed in producing set-oriented computation. Processes evaluating the recursive program in a distributed network are described, and an efficient method for testing the termination of the computation is developed. The approach is compared with sequential as well as dataflow-oriented evaluation  相似文献   

16.
杨霄 《微机发展》2004,14(3):102-103,119
递归函数独特的运算方式,使其在人工智能和各种事务处理过程中有着广泛的应用,因而成为一个重要的研究课题。文中以迷宫、汉诺塔等为例.根据计算机堆栈原理,具体讨论了用递归函数解题的方法和技巧。给出了递归函数调用时利用变量传递解决复杂问题的实例,展示了递归算法在解决非数值运算问题中的独特解题方式和效果。讨论表明,在求解人工智能和各种事务处理问题中.递归函数中合理地利用变量传递可有效地完成求解任务和提高程序的品质。  相似文献   

17.
一种新的并行文化微粒群优化算法   总被引:4,自引:2,他引:2       下载免费PDF全文
为了避免微粒群优化算法在解决复杂优化问题时陷入局部最优,提高算法种群的多样性。将微粒群优化算法纳入文化算法框架,提出了一种新的基于文化算法框架的并行微粒群优化算法。在文化算法框架中,由微粒群组成的群体空间和信念空间各自独立并行演化,并相互影响,有效地提高了种群的多样性,降低了陷入局部极值的可能性。通过对不同测试函数的仿真实验表明,新提出的并行文化微粒群优化算法比标准微粒群优化算法更容易找到全局最优解,提高了微粒群优化算法的全局寻优能力。  相似文献   

18.
蜂群—蚁群自适应优化算法*   总被引:1,自引:0,他引:1  
为了解决蚁群算法在求解连续函数优化问题时,存在局部搜索能力较差的缺陷,提出一种新颖的自适应蜂群—蚁群优化算法。新算法在蚁群优化算法的基础上,设计了一种参数q的自适应机制,进而减少了参数个数,提高了其鲁棒性;根据蜂群算法基本思想,利用雇佣蜂和观察蜂设计了高效的局部搜索算子,从而提升了算法的局部能力。针对五个标准测试函数的仿真实验结果表明:与蚁群优化算法相比,新算法的全局和局部寻优能力均得到了极大的提升。  相似文献   

19.
Calling context encoding (CCE) uses some integers to represent the current context of any execution point, and is valuable in a wide range of applications. Existing studies already ensure accurate results of CCE, but they use stack to store the encoding when recursive calls happen. This may cost much for cyclic call graphs with deep recursive calls. This paper presents a novel approach (called RCCE) to address this problem. With a modified strategy for encoding, it requires less frequent access of the stack and gives more succinct representation of contexts on deep recursion. Experimental results show that compared to Precise Calling Context Encoding, RCCE is more practical and efficient on calling contexts with recursions.  相似文献   

20.
A mechanically verified language implementation   总被引:1,自引:0,他引:1  
This paper briefly describes a programming language, its implementation on a microprocessor via a compiler and link-assembler, and the mechanically checked proof of the correctness of the implementation. The programming language, called Piton, is a high-level assembly language designed for verified applications and as the target language for high-level language compilers. It provides executeonly programs, recursive subroutine call and return, stack based parameter passing, local variables, global variables and arrays, a user-visible stack for intermediate results, and seven abstract data types including integers, data addresses, program addresses and subroutine names. Piton is formally specified by an interpreter written for it in the computational logic of Boyer and Moore. Piton has been implemented on the FM8502, a general purpose microprocessor whose gate-level design has been mechanically proved to implement its machine code interpreter. The FM8502 implementation of Piton is via a function in the Boyer-Moore logic which maps a Piton initial state into an FM8502 binary core image. The compiler and link-assembler are both defined as functions in the logic. The implementation requires approximately 36K bytes and 1400 lines of prettyprinted source code in the Pure Lisp-like syntax of the logic. The implementation has been mechanically proved correct. In particular, if a Piton state can be run to completion without error, then the final values of all the global data structures can be ascertained from an inspection of an FM8502 core image obtained by running the core image produced by the compiler and link-assembler. Thus, verified Piton programs running on FM8502 can be thought of as having been verified down to the gate level.This work was supported in part by the Defense Advanced Research Projects Agency under DARPA Orders 6082 and 9151, contract MDA904-87-C-H009.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号