首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 32 毫秒
1.
Compiling scientific code using partial evaluation   总被引:1,自引:0,他引:1  
Berlin  A. Weise  D. 《Computer》1990,23(12):25-37
The partial evaluation approach, which transforms a high-level program into a low-level program that is specialized for a particular application, exposing the parallelism inherent in the underlying numerical computation, is discussed. A prototype compiler that uses partial evaluation is described. Experiments with the compiler have shown that for an important class of numerical programs, partial evaluation can provide marked performance improvements: speedups over conventionally compiled code that range from seven times faster to 91 times faster have been measured. By coupling partial evaluation with parallel scheduling techniques, the low-level parallelism inherent in a computation can be exploited on heavily pipelined or parallel architectures. The approach has been demonstrated by applying a parallel scheduler to a partially evaluated program that simulates the motion of a nine-body solar system  相似文献   

2.
In this paper, we describe our experience of creating an OpenMP implementation of the SPICE3 circuit simulator program. Given the irregular patterns of access to dynamic data structures in the SPICE code, a parallelization using current standard OpenMP directives is impossible without major rewriting of the original program. The aim of this work is to present a case study showing the development of a shared memory parallel code with minimum effort. We present two implementations, one with minimal code modification and one without modification to the original SPICE3 program using Intel’s taskq construct. We also discuss the results of the case study in terms of what future compiler tools may be needed to help OpenMP application developers with similar porting goals. Our experiments using SPICE3, based on SRAM model simulation, were compiled by the SUN compiler running on a SunFire V880 UltraSPARC-III 750 MHz and by the Intel icc compiler running on both an IBM Itanium with four CPUs and Intel Xeon of two processors machines. The results are promising.  相似文献   

3.
针对Front相对Elegant属性语法规则在语义表达方面的不足,采用嵌入文本、增加可选结构等方式,对Front的语法和语义进行了有效扩展。用其开发了Modelica仿真建模语言编译器前端、编译了ScanGen、Diagrams及Front自身的编译前端,结果表明,所采用的扩展思路简便易行,扩展后的Front基本具备与Elegant属性语法相当的语义表达能力,能够满足复杂语言编译器前端的需要,且保持与扩展前版本的后向兼容性。  相似文献   

4.
投机优化技术作为一种先进的现代编译技术,有效地提高了指令执行的并行性。然而,在逆向工程中,有时要实现代码的跨平台移植,而投机优化技术又受硬件平台的制约;有时需要优化代码的结构,使程序的逻辑结构易于理解;这些都要求消除这种与硬件息息相关的优化技术。论文基于IA-64平台,提出了一种反投机处理算法,对ICC编译器编译后的可执行二进制代码进行处理,消除代码中的投机优化,将其转换成等价的没有投机优化的指令序列,这样使反投机后的代码更容易理解,而且在逆向工程中摆脱了硬件的限制。测试表明该反投机技术可以对ICC编译后的代码进行有效处理。  相似文献   

5.
An interactive facility for development and maintenance of Pascal programs is described. Features include interactive addition, deletion and modification of modules, independentcompilation of groups of procedures, and automatic recompilation of an entire program only if global declarations are modified. The implementation required modification of a Pascal compiler to process an arbitrary number of programs in a single run and to compile correctly if the main program is missing. Overall processing is managed by a parameterized control statement procedure that invokes several system utilities. The facility has been used successfully in instructional and research computing situations. Significant remaining problems are errors caused by mismatched parameters between independently compiled procedures, and interpretation by users of error messages generated by system utilities.  相似文献   

6.
可信编译理论及其核心实现技术:研究综述   总被引:1,自引:0,他引:1       下载免费PDF全文
编译器是重要的系统软件之一,高级语言编写的软件都必须经过编译器的编译才能成为可执行程序。编译器的可信性对于整个计算机系统而言具有非常关键的意义,如果编译器不可信,则很难保证系统所运行软件的可信性。可信编译是指编译器在保证编译正确的同时提供相应的机制保证编译对象的可信性,对可信编译理论和技术的研究具有重要理论意义和实用前景。阐述了可信编译器的概念,介绍了编译过程正确性的形式化定义,对可信编译的主要研究进行了概括。在全面分析可信编译研究现状的基础上,从编译器自身可信性和确保编译对象可信性两个方面,对可信编译器设计和实现的相关理论和方法进行了分类和总结。最后,讨论了可信编译有待解决的问题和未来的研究方向。  相似文献   

7.
This article proposes a programming language called “Espace” for parallel and distributed computation. In general, it is difficult to code a distributed, parallel program due to multi-threading, message passing, managing clients, and so on. Espace involves a few simple syntax rules added to Java. Developers do not need to know how to write a parallel, distributed program source code in detail. This work applies Espace to parallelize an evolutionary computation program, and shows that the Espace compiler allows the conversion of an evolutionary computation program written in Java into a distributed, parallel system by adding a few words to the program.  相似文献   

8.
文章[1]中提出了数组之间的数据融合优化方法,并以IA-32服务器为平台测试了数据融合优化的效果。测试结果表明,在IA-32机器上,数据融合优化在性能代价模型的控制下,能较好地改善具有非连续数据访问特征的应用程序的CACHE利用率。那么,在新一代体系结构IA-64平台上,数据融合优化的效果如何呢?该文分别以IntelIA-32服务器和HPITANIUM服务器为平台,用IntelFORTRAN编译器ifc和efc及自由软件编译器g95分别编译并运行数据融合优化变换前后的程序,获得两种平台上的执行时间及相关的性能数据。测试结果表明,源程序级的数据融合优化不能很好地与IA-64平台上的EFC编译器高级优化配合工作,在O3级优化开关控制下,优化效果是负值。此测试结果进一步表明,编译高级优化如数据预取、循环变换和数据变换等各种优化必须结合体系结构的特点统筹考虑,才能取得好的全局优化效果。该文为研究各种面向IA-32体系结构的编译优化算法在IA-64体系结构上的性能可移植性优化起到抛砖引玉的作用。  相似文献   

9.
Features of an explicitly parallel programming language targeted for reconfigurable parallel processing systems, where the machine's N processing elements (PEs) are capable of operating in both the SIMD and SPMD modes of parallelism, are described. The SPMD (single program-multiple data) mode of parallelism is a subset of the MIMD mode where all processors execute the same program. By providing all aspects of the language with an SIMD mode version and an SPMD mode version that are syntactically and semantically equivalent, the language facilitates experimentation with and exploitation of hybrid SIMD/SPMD machines. Language constructs (and their implementations) for data management, data-dependent control-flow, and PE-address-dependent control-flow are presented. These constructs are based on experience gained from programming a parallel machine prototype and are being incorporated into a compiler under development. Much of the research presented is applicable to general SIMD machines and MIMD machines  相似文献   

10.
对于影响用户交互响应速度的瓶颈代码段,现有即时编译器存在无法准确选取和在程序启动阶段没有可用的本地码进行加速的问题,这影响了即时编译技术在用户交互响应方面的加速效果。为此,对即时编译器原有的代码选择策略和编译模式进行了改进。在代码选择策略方面,应用程序可以根据实际运行情况主动选择要编译的代码段,保证所有影响用户交互响应速度的瓶颈代码段都能被选取并被加速;在编译模式方面,本次编译得到的本地码可以保存并供程序下次运行时使用,保证在程序启动阶段也有本地码可用来加速。应用程序启动速度的实验表明,改进的即时编译器能够提升1倍的用户响应速度。  相似文献   

11.
We investigate the compiler algorithms to support compiled communication in multiprocessor environments and study the benefits of compiled communication, assuming that the underlying network is an all-optical time-division-multiplexing (TDM) network. We present an experimental compiler, E-SUIF, that supports compiled communication for High Performance Fortran (HPF) like programs on all-optical TDM networks, and describe and evaluate the compiler algorithms used in E-SUIF. We further demonstrate the effectiveness of compiled communication on all-optical TDM networks by comparing the performance of compiled communication with that of a traditional communication method using a number of application programs.  相似文献   

12.
M. K. Crowe 《Software》1987,17(7):455-467
A system for dynamic compilation under the Unix operating system is described. The basis of the system is an incremental assembler that can be used statically or during program execution to insert or replace a module in an executable image. All cross-module references are via offets into a run-time symbol table. All generated code is independent of its location or the location of the symbol table. The symbol table and all modules reside in memory segments compatible with the memory allocator malloc() . The symbol table origin is maintained in a processor register. Library procedures allow the assembler (or C compiler) to be called to alter the currently executing program, or to place a stub function which acts as a trap, so that when the stub is invoked it caues a file to be dynamically compiled into the executing program to replace the stub with a bona fide procedure. This facilitates the construction of advanced interactive environments using native code. Some example applications, to Prolog and to incremental compilation, are considered.  相似文献   

13.
In this paper, a source to source parallelizing compiler system, AutoPar, is presentd. The system transforms FORTRAN programs to multi-level hybrid MPI/OpenMP parallel programs. Integrated parallel optimizing technologies are utilized extensively to derive an effective program decomposition in the whole program scope. Other features such as synchronization optimization and communication optimization improve the performance scalability of the generated parallel programs, from both intra-node and inter-node. The system makes great effort to boost automation of parallelization. Profiling feedback is used in performance estimation which is the basis of automatic program decomposition. Performance results for eight benchmarks in NPB1.0 from NAS on an SMP cluster are given, and the speedup is desirable. It is noticeable that in the experiment, at most one data distribution directive and a reduction directive are inserted by the user in BT/SP/LU. The compiler is based on ORC, Open Research Compiler. ORC is a powerful compiler infrastructure, with such features as robustness, flexibility and efficiency. Strong analysis capability and well-defined infrastructure of ORC make the system implementation quite fast.  相似文献   

14.
Performance prediction is an important engineering tool that provides valuable feedback on design choices in program synthesis and machine architecture development. We present an analytic performance modeling approach aimed to minimize prediction cost, while providing a prediction accuracy that is sufficient to enable major code and data mapping decisions. Our approach is based on a performance simulation language called PAMELA. Apart from simulation, PAMELA features a symbolic analysis technique that enables PAMELA models to be compiled into symbolic performance models that trade prediction accuracy for the lowest possible solution cost. We demonstrate our approach through a large number of theoretical and practical modeling case studies, including six parallel programs and two distributed-memory machines. The average prediction error of our approach is less than 10 percent, while the average worst-case error is limited to 50 percent. It is shown that this accuracy is sufficient to correctly select the best coding or partitioning strategy. For programs expressed in a high-level, structured programming model, such as data-parallel programs, symbolic performance modeling can be entirely automated. We report on experiments with a PAMELA model generator built within a dataparallel compiler for distributed-memory machines. Our results show that with negligible program annotation, symbolic performance models are automatically compiled in seconds, while their solution cost is in the order of milliseconds.  相似文献   

15.
Kasi Anantha  Fred Long 《Software》1990,20(6):537-554
There are two principal methods used to exploit the parallelism available on a parallel machine: the program to be executed can be optimized by hand, or the program can be automatically converted to parallel machine code by a compiler. The first method usually derives parallelism at the procedure level; a parallel program is written in a high-level language and typically has various modules executing in parallel. By contrast, the compiler methodically transforms the program into parallel code using various transformations, such as code movement. The automatic conversion of a program to parallel code is called compaction or parallelization. This paper describes the evolution of a new compaction program and presents a new algorithm for determining legal code movements. A simulator of the target architecture was used to estimate the execution times of a sample suite of programs before and after compaction. The results verify that substantial advantages arise from applying this compaction technique.  相似文献   

16.
This paper reports on a compiler for translation of constraint specifications into procedural parallel programs. A constraint program in our system consists of a set of constraints and an input set containing a subset of the variables appearing in the constraints. The compiler described in this paper successfully compiles a substantially larger class of constraint specifications to efficient programs than did its predecessors. In particular the compiler has been extended to generate processor and memory efficient programs for cyclic constraints which can be resolved by computational relaxation methods. The paper first details the basic compilation process for noncyclic constraints. It then describes the additional steps in the compilation process which enable resolution of cyclic constraints to iterative computational processes and illustrates the process using derivation of a parallel program for solution of the Laplace equation as the example.  相似文献   

17.
A system based on the notion of a flow graph is used to specify formally and to implement a compiler for a lazy functional language. The compiler takes a simple functional language as input and generates C. The generated C program can then be compiled, and loaded with an extensive run-time system to provide the facility to experiment with different analysis techniques. The compiler provides a single, unified, efficient, formal framework for all the analysis and synthesis phases, including the generation of C. Many of the standard techniques, such as strictness and boxing analyses, have been included.  相似文献   

18.
Queue computing is an attractive alternative for the compulsive demand of high-performance architectures. Code generation for queue machines has some problems but the solutions have not been studied thoroughly. A new parallel queue computation model, 2-offset P-Code queue computation model, is presented together with a new code generation algorithm. The code generation algorithm takes leveled DAGs as input and produces 2-offset P-Code assembly. We also developed a queue compiler to evaluate the new algorithm and compiled a set of C language benchmark programs for the 2-offset P-Code. The queue compiler generates between 8.55% less instructions and 10.55% more instructions than an actual MIPS32 compiler for the compiled programs.  相似文献   

19.
CODE: a unified approach to parallel programming   总被引:1,自引:0,他引:1  
Browne  J.C. Azam  M. Sobek  S. 《Software, IEEE》1989,6(4):10-18
The authors describe CODE (computation-oriented display environment), which can be used to develop modular parallel programs graphically in an environment built around fill-in templates. It also lets programs written in any sequential language be incorporated into parallel programs targeted for any parallel architecture. Broad expressive power was obtained in CODE by including abstractions of all the dependency types that occur in the widely used parallel-computation models and by keeping the form used to specify firing rules general. The CODE programming language is a version of generalized dependency graphs designed to encode the unified parallel-computation model. A simple example is used to illustrate the abstraction level in specifying dependencies and how they are separated from the computation-unit specification. The most important CODE concepts are described by developing a declarative, hierarchical program with complex firing rules and multiple dependency types  相似文献   

20.
为了解决面向方面编程中的方面冲突问题,在分析现有解决方法的基础上,提出了一种基于契约式设计的方面冲突自动检测方案。根据设计文档使用JML给方面和基础程序标注契约,利用契约转换程序生成契约检查程序,契约检查程序与面向方面的应用程序一起编译,生成包含契约检查的目标文件,从而在程序执行时,自动检测出方面与基础程序间的冲突以及方面与方面间冲突。该方案不破坏现有的应用程序,且无需重新设计编译器。通过一个实例表明该方案的可行性。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号