1.

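都志辉 丁文魁 郑耿斌 李晓明 许卓群 《软件学报》1999,10(1):60-67

HPF (High Performance Fortran) is a typical data-parallel language, and implementing an HPF compilation system is one of the hard problems in parallel computing research. This paper describes the design and implementation of an HPF compilation system. After briefly introducing the system's main components, it focuses on several key techniques in the implementation, lists some HPF source programs together with the code the compiler generates for them, and closes with performance measurements of the compiler and a discussion of related issues.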

2.

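Data Communication Optimization Algorithm for the LS SIMD C Compiler  Total citations: 1; self-citations: 1; citations by others: 0

王晖 何华籼 等 《计算机科学》2001,28(9):116-118

1 Introduction. Realizing an ideal automatic program parallelization system still faces many hard problems, so the more popular approach to parallel computing is to write parallel programs in a parallel language and have the compiler translate them into node programs for execution. By the granularity of parallel execution, parallel languages divide into task-based parallel languages (aimed mainly at general application computing) and data-parallel languages (used mainly for scientific numerical computing); HPF is a typical data-parallel language. In a data-parallel language, the parallelism of program execution has already been specified by the programmer on the basis of the data dependences in the program. How to determine the data distribution and optimize data communication is therefore a key factor in the execution efficiency of parallel programs. Data distribution can be roughly divided into two stages: first, dependence analysis of the data in the source program yields the distribution of data over abstract processors; then the distribution over abstract processors is mapped onto physical processors. The data distribution is usually determined in one of several ways: in one, the programmer specifies the abstract data distribution, and the compil…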
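The two-stage distribution described in this abstract can be illustrated with a minimal Python sketch. It assumes a BLOCK distribution onto abstract processors and a round-robin folding of abstract processors onto physical ones; both choices, and the function names `block_owner`, `physical_of`, and `placement`, are hypothetical and are not taken from the paper.

```python
from typing import List


def block_owner(index: int, n: int, num_abstract: int) -> int:
    """Stage 1: map element `index` of an n-element array to an
    abstract processor using a BLOCK distribution (assumed here)."""
    block = -(-n // num_abstract)  # ceiling division: elements per block
    return index // block


def physical_of(abstract_id: int, num_physical: int) -> int:
    """Stage 2: fold abstract processors onto physical processors
    round-robin (an assumption; other mappings are possible)."""
    return abstract_id % num_physical


def placement(n: int, num_abstract: int, num_physical: int) -> List[int]:
    """Physical processor that holds each of the n array elements."""
    return [
        physical_of(block_owner(i, n, num_abstract), num_physical)
        for i in range(n)
    ]


if __name__ == "__main__":
    # 8 elements, 4 abstract processors, 2 physical processors
    print(placement(8, 4, 2))
```

Keeping the two stages as separate functions mirrors the separation in the abstract: the first depends only on the program's data, the second only on the machine.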

3.

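Research on Data Decomposition in Distributed Systems

沈亚楠 姚远 张平 赵荣彩 罗向阳 《计算机工程》2006,32(11):114-115,132

Data decomposition is critical for a parallelizing compiler to achieve high performance on message-passing parallel machines. From the data decomposition information (the mapping of data to processors) derived automatically by the compiler, loop nests of send/receive messages can be generated in C, thereby distributing the data among the processors. A proven and powerful mathematical model is used to generate the data decomposition code, and a formal algorithm and its implementation are also given. Preliminary experimental results show that the algorithm can improve performance significantly.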

4.

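Parallel Computing Paradigms Supported by the p-HPF Compiler in a Cluster Environment  Total citations: 2; self-citations: 0; citations by others: 2

胡长军 余华山 丁文魁 许卓群 《计算机研究与发展》2001,38(8):954-959

p-HPF is a parallel compilation system that conforms to the HPF (High Performance Fortran) specification. Realizing multi-paradigm parallel computing with HPF at its core is the basis for developing large parallel application systems. The paper first discusses parallel execution paradigms in a cluster environment, including the farm parallel paradigm, pipeline parallelism, stream-loop parallelism, data parallelism, and composite data parallelism, and analyzes their performance in the abstract. It then shows how to implement several typical parallel paradigms using p-HPF's extrinsic procedure mechanism, its task parallelism mechanism, and typical parallel statements such as FORALL and INDEPENDENT DO. Example programs are given, run, and their results analyzed.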

5.

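A Dependence-Based Synchronization Optimization Algorithm  Total citations: 3; self-citations: 1; citations by others: 3

张平 赵荣彩 李清宝 《计算机工程》2005,31(17):68-70

Presents a synchronization optimization algorithm based on the data dependence graph. Implemented as an independent pass in an automatic parallelizing compiler, it uses the compiler's dependence analysis results to optimize barrier synchronization at compile time.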

6.

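Computation Partitioning of Parallel Loops Based on Canonical Partition Sets

黄其军 杨建武 余华山 许卓群 《软件学报》2003,14(3):362-368

Computation partitioning is one of the most important problems in parallel compilation. For parallel loops whose data distribution is already fixed, this paper proposes a computation partitioning algorithm based on canonical sets, and discusses in detail how the canonical sets are obtained and how the optimal scheme is selected by weighing communication against load balance. Experiments show that, for parallel loops, the algorithm is simpler and more effective than several earlier algorithms, and that the p_HPF compiler using it achieves good speedup and efficiency on data-parallel applications. The compiler has been applied in the petroleum industry.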

7.

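Application of the HPF High-Performance Language to Parallel Seismic Data Processing (E-mail: zhangjh@mail.hdpu.edu.cn)

张军华 仝兆岐 何潮观 许卓群 《计算机工程与应用》2002,(14)

HPF (High Performance Fortran) is a high-performance data-parallel language released by the HPF Forum in 1993. Using the p_HPF parallel compilation system developed by a partner institution, the work described here has reached large-scale industrial use in seismic data processing. Benchmark programs and results on real data show that HPF-based parallel seismic data processing offers strong programmability, good portability, and high parallel efficiency, and has good application prospects.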

8.

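Application of the HPF High-Performance Language to Parallel Seismic Data Processing  Total citations: 1; self-citations: 0; citations by others: 1

张军华 仝兆岐 何潮观 许卓群 《计算机工程与应用》2002,38(14):38-39,181

HPF (High Performance Fortran) is a high-performance data-parallel language released by the HPF Forum in 1993. Using the p-HPF parallel compilation system developed by a partner institution, the work described here has reached large-scale industrial use in seismic data processing. Benchmark programs and results on real data show that HPF-based parallel seismic data processing offers strong programmability, good portability, and high parallel efficiency, and has good application prospects.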

9.

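Research and Implementation of an HPF-Based Parallel Seismic Data Processing System

张军华 仝兆岐 何潮观 许卓群 《计算机应用》2002,22(9):57-59

The paper first analyzes why parallel processing of seismic data is necessary. After introducing the features of the HPF high-performance parallel language, it focuses on the architecture and key techniques of the p-HPF compilation system. Then, after analyzing the characteristics of seismic data and of modular program design in the processing flow, it gives a parallelization strategy for seismic data processing and a block diagram of a general SPMD program implementation. Finally, a simple example shows the parallel efficiency of an HPF program for different data sizes, and large-scale parallel processing is carried out on real seismic data.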

10.

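Parallel Loop Iteration Distribution Algorithms in Data-Parallel Language Compilation Systems

何连跃 沈志宇 《计算机工程与设计》1999,20(3):49-55,F004

Discusses parallel loop iteration distribution algorithms in data-parallel language compilation for massively parallel machines. A data-parallel language has four data distribution modes: BLOCK, BLOCK(1), BLOCK(N), and ":". Loop iteration distribution is aligned with the data distribution, and the paper gives the loop iteration distribution algorithm for each of these modes. The algorithms allow the subscript of the aligned array that determines the data distribution to be a first-order linear expression with arbitrary coefficients, and the increment of the parallel loop to be any nonzero integer.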
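As an illustration of aligning loop iterations with a data distribution, the following Python sketch selects the iterations that one processor would execute when the array accessed as `A[a*i + b]` is BLOCK-distributed. It enumerates iterations by brute force rather than computing closed-form bounds, so it is not the algorithm from the paper; the names `owner` and `local_iterations` and the BLOCK-only restriction are assumptions made for the example.

```python
from typing import List


def owner(elem: int, n: int, nprocs: int) -> int:
    """Processor that owns element `elem` of an n-element array
    under a BLOCK distribution (ceiling-sized blocks)."""
    block = -(-n // nprocs)  # ceiling division
    return elem // block


def local_iterations(lo: int, hi: int, step: int,
                     a: int, b: int,
                     n: int, nprocs: int, me: int) -> List[int]:
    """Iterations i of `for i in range(lo, hi, step)` whose access
    A[a*i + b] falls in the block owned by processor `me`.
    The subscript a*i + b is a first-order linear expression and
    `step` may be any nonzero integer, positive or negative."""
    if step == 0:
        raise ValueError("loop increment must be nonzero")
    result = []
    for i in range(lo, hi, step):
        elem = a * i + b
        # Skip subscripts outside the array, then keep owned elements.
        if 0 <= elem < n and owner(elem, n, nprocs) == me:
            result.append(i)
    return result


if __name__ == "__main__":
    # 12 elements over 3 processors, loop touches A[2*i + 1]
    for p in range(3):
        print(p, local_iterations(0, 6, 1, 2, 1, 12, 3, p))
```

A production compiler would derive each processor's iteration bounds symbolically instead of filtering the full iteration space, but the filtered result is what those bounds must reproduce.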

11.

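Research and Implementation of a Verilog Compilation Technique Supporting Parallel Simulation

李暾 李思昆 郭阳 刘功杰 《计算机工程与应用》2002,38(16):184-187

Parallel HDL simulation is an effective way to speed up simulation-based verification of large, complex VLSI systems, and HDL compilation that supports parallel simulation is one of its key techniques. This paper proposes a Verilog compilation technique supporting parallel simulation: the compiler translates a Verilog description into C++ code, which is then compiled and linked with the parallel simulation kernel library to produce an executable parallel program. The paper covers the compiler's structure, the code generation method, and the parallel simulation kernel library. The technique has been implemented in the parallel Verilog simulator ParaVer.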

12.

Adam: An Ada-based language for multiprocessing

D. C. Luckham F. W. Von Henke H. J. Larsen D. R. Stevenson 《Software》1984,14(7):605-642

Adam is a high-level language for parallel processing. It is intended for programming resource scheduling applications, in particular supervisory packages for run-time scheduling of multiprocessing systems. An important design goal was to provide support for implementation of Ada and its run-time environment. Adam has been used to implement Ada task supervision and also as a high-level target language for compilation of Ada tasking. Adam provides facilities corresponding to the Ada sequential constructs (including subprograms, packages, exceptions, generics). In addition, it provides specialized module constructs for implementation of packages that may be shared between parallel processes, and new predefined types for scheduling. The parallel processing constructs of Adam are more primitive than Ada tasking. Strong restrictions are enforced on the ways in which parallel processes can interact. A compiler for Adam has been implemented in MacLisp on DEC PDP-10 computers. Runtime support packages in Adam for scheduling (on a single CPU) and I/O are also provided. The compiler contains a library manipulation facility for separate compilation. The Adam compiler has been used to build an Ada compiler for most of the July 1980 Ada, including task types and rendezvous constructs. This was achieved by implementing the translation of Ada tasking into Adam parallel processing as a preprocessor to the Adam compiler. This present Ada compiler, which has been operational since December 1980, uses a procedure call implementation of tasking. It can be easily modified to other implementations. Compilation of Ada tasking into a high-level target language such as Adam facilitates studying questions of correctness and efficiency of various compilation algorithms, and code optimizations specific to tasking, e.g. elimination of unnecessary threads of control. This paper gives an overview of Adam and examples of its use. Emphasis is placed on the differences from Ada. 
Experience using Adam to build the experimental Ada system is evaluated. Design of a run-time supervisor in Adam is discussed in detail.

13.

Compiling High Performance Fortran for distributed-memory architectures

Siegfried Benkner Hans Zima 《Parallel Computing》1999,25(13-14)

High Performance Fortran (HPF) is a data-parallel language that provides a high-level interface for programming scientific applications, while delegating to the compiler the task of generating explicitly parallel message-passing programs. This paper provides an overview of HPF compilation and runtime technology for distributed-memory architectures, and deals with a number of topics in some detail. In particular, we discuss distribution and alignment processing, the basic compilation scheme and methods for the optimization of regular computations. A separate section is devoted to the transformation and optimization of independent loops with irregular data accesses. The paper concludes with a discussion of research issues and outlines potential future development paths of the language.

14.

Tilting at Windmills with Coq: Formal Verification of a Compilation Algorithm for Parallel Moves

Laurence Rideau Bernard Paul Serpette Xavier Leroy 《Journal of Automated Reasoning》2008,40(4):307-326

This article describes the formal verification of a compilation algorithm that transforms parallel moves (parallel assignments between variables) into a semantically-equivalent sequence of elementary moves. Two different specifications of the algorithm are given: an inductive specification and a functional one, each with its correctness proofs. A functional program can then be extracted and integrated in the Compcert verified compiler.
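To make the problem concrete, here is a minimal Python sketch of sequentializing a parallel move. It is a common textbook-style approach (emit moves whose destination is no longer read, and break remaining cycles through one temporary), not the verified algorithm from the article. The names `sequentialize` and `run_sequential`, the temporary name `"tmp"`, and the assumption that all destinations are distinct are choices made for this example.

```python
from typing import Dict, List, Tuple

Move = Tuple[str, str]  # (source, destination)


def sequentialize(moves: List[Move], tmp: str = "tmp") -> List[Move]:
    """Turn a parallel move (all destinations distinct, `tmp` unused)
    into an equivalent sequence of elementary moves."""
    pending: Dict[str, str] = {dst: src for src, dst in moves if src != dst}
    out: List[Move] = []
    while pending:
        sources = set(pending.values())
        # A destination that no pending move still reads is safe to overwrite.
        ready = [dst for dst in pending if dst not in sources]
        if ready:
            dst = ready[0]
            out.append((pending.pop(dst), dst))
        else:
            # Only cycles remain: save one destination in tmp and
            # redirect every move that reads it to read tmp instead.
            victim = next(iter(pending))
            out.append((victim, tmp))
            for dst in list(pending):
                if pending[dst] == victim:
                    pending[dst] = tmp
    return out


def run_sequential(env: Dict[str, int], seq: List[Move]) -> Dict[str, int]:
    """Execute elementary moves one after another on a copy of env."""
    env = dict(env)
    for src, dst in seq:
        env[dst] = env[src]
    return env


if __name__ == "__main__":
    swap_and_copy = [("a", "b"), ("b", "a"), ("c", "d")]
    print(sequentialize(swap_and_copy))
```

The temporary is reused across cycles because each cycle is fully unwound before the next one is broken.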

15.

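Parallelism Recognition Techniques Based on Nested Loop Classification

赵捷 赵荣彩 丁锐 黄品丰 《软件学报》2012,23(10):2695-2704

Traditional distributed-memory parallelizing compilers are mostly built on top of shared-memory parallelizing compilers. The parallelism recognition techniques of shared-memory compilers are suited to OpenMP code generation and treat all nested loops with the same recognition method; applied to a distributed-memory compiler, they inevitably fail to exploit program parallelism efficiently. A distributed-memory parallelizing compiler should instead classify nested loops by their structural characteristics and use recognition techniques suited to MPI code generation. To address this, the paper proposes a new nested loop classification method based on the structure of nested loops and the characteristics of MPI parallel programs, together with a corresponding parallelism recognition technique for each class. Experimental results show that, compared with a distributed-memory parallelizing compiler using traditional recognition techniques, a compiler that classifies nested loops by the proposed method and applies the corresponding techniques recognizes the parallel loops in benchmark programs more efficiently, and the speedup of the automatically generated MPI code improves by more than 20%.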

16.

Exploiting Distributed-Memory and Shared-Memory Parallelism on Clusters of SMPs with Data Parallel Programs

Benkner Siegfried Sipkova Viera 《International journal of parallel programming》2003,31(1):3-19

Clusters of SMPs are hybrid-parallel architectures that combine the main concepts of distributed-memory and shared-memory parallel machines. Although SMP clusters are widely used in the high performance computing community, there exists no single programming paradigm that allows exploiting the hierarchical structure of these machines. Most parallel applications deployed on SMP clusters are based on MPI, the standard API for distributed-memory parallel programming, and thus may miss a number of optimization opportunities offered by the shared memory available within SMP nodes. In this paper we present extensions to the data parallel programming language HPF and associated compilation techniques for optimizing HPF programs on clusters of SMPs. The proposed extensions enable programmers to control key aspects of distributed-memory and shared-memory parallelization at a high level of abstraction. Based on these language extensions, a compiler can adopt a hybrid parallelization strategy which closely reflects the hierarchical structure of SMP clusters by automatically exploiting shared-memory parallelism based on OpenMP within cluster nodes and distributed-memory parallelism utilizing MPI across nodes. We describe the implementation of these features in the VFC compiler and present experimental results which show the effectiveness of these techniques.

17.

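An Extensible Automatic Parallelizing Compilation System

沈勤华 《计算机工程》2009,35(8):94-96

Introduces Agassiz, an extensible automatic parallelizing compilation system, and examines its architecture and key features. The system converts serial programs into parallel programs and provides a good platform for research on compiler optimization techniques. Through its object-oriented design and implementation, it can effectively integrate a variety of parallel optimization techniques. Experimental results show that the system has good extensibility.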

18.

Optimized Parallel Execution of Declarative Programs on Distributed Memory Multiprocessors

Shen Meiming Tian Xinmin Wang Dingxing Zheng Weimin Wen Dongchan 《计算机科学技术学报》1993,8(3):43-52

In this paper, we focus on the compiling implementation of the parallel logic language PARLOG and the functional language ML on distributed memory multiprocessors. Under the graph rewriting framework, a Heterogeneous Parallel Graph Rewriting Execution Model (HPGREM) is presented first. Then, based on HPGREM, a parallel abstract machine PAM/TGR is described. Furthermore, several optimizing compilation schemes for executing declarative programs on a transputer array are proposed. The performance statistics on the transputer array demonstrate the effectiveness of our model, parallel abstract machine, optimizing compilation strategies, and compiler.

19.

Compilation of Constraint Programs with Noncyclic and Cyclic Dependencies to Procedural Parallel Programs

Ajita John James C. Browne 《International journal of parallel programming》1998,26(1):65-119

This paper reports on a compiler for translation of constraint specifications into procedural parallel programs. A constraint program in our system consists of a set of constraints and an input set containing a subset of the variables appearing in the constraints. The compiler described in this paper successfully compiles a substantially larger class of constraint specifications to efficient programs than did its predecessors. In particular the compiler has been extended to generate processor and memory efficient programs for cyclic constraints which can be resolved by computational relaxation methods. The paper first details the basic compilation process for noncyclic constraints. It then describes the additional steps in the compilation process which enable resolution of cyclic constraints to iterative computational processes and illustrates the process using derivation of a parallel program for solution of the Laplace equation as the example.
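The Laplace example mentioned in this abstract is a cyclic constraint: each interior grid value must equal the average of its four neighbours, which in turn depend on it. A minimal Python sketch of resolving such a constraint by relaxation is given below. It uses plain Jacobi iteration on a fixed-boundary grid, run sequentially; the function name `jacobi_laplace`, the tolerance, and the iteration cap are assumptions for illustration and are not taken from the paper or its compiler output.

```python
from typing import List

Grid = List[List[float]]


def jacobi_laplace(grid: Grid, tol: float = 1e-6,
                   max_iters: int = 10_000) -> Grid:
    """Relax the interior of `grid` until every interior point is
    (approximately) the mean of its four neighbours.
    Boundary rows and columns are held fixed."""
    rows, cols = len(grid), len(grid[0])
    cur = [row[:] for row in grid]
    for _ in range(max_iters):
        nxt = [row[:] for row in cur]
        delta = 0.0  # largest change in this sweep
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                nxt[i][j] = 0.25 * (cur[i - 1][j] + cur[i + 1][j]
                                    + cur[i][j - 1] + cur[i][j + 1])
                delta = max(delta, abs(nxt[i][j] - cur[i][j]))
        cur = nxt
        if delta < tol:
            break
    return cur


if __name__ == "__main__":
    # Top edge held at 1.0, all other edges at 0.0
    start = [[1.0] * 5] + [[0.0] * 5 for _ in range(4)]
    for row in jacobi_laplace(start):
        print(["%.3f" % v for v in row])
```

Jacobi is convenient for a parallel setting because every update in a sweep reads only values from the previous sweep, so the interior points can be computed independently.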