共查询到20条相似文献,搜索用时 15 毫秒
1.
实用数据依赖分析方法 总被引:1,自引:1,他引:1
数据依赖分析是检测程序循环级并行的基本步骤,基于数组下标对分类,本文提出了一个实用,有效的数据依赖分析方案。现有的依赖测试算法,都有循环正规化的假设,由于它存在某些弊端,我们抛弃这一假设,允许循环增量是任意整表达式,为此,本文对有关依赖的定义做了适当修改,并重新推导了某些重要结论,为处理循环增量为变量或表达式的情形,给出了弱形式下的GCD和Banerjee测试,该方案已在PORT中实现。 相似文献
2.
3.
Symbolic analysis is of paramount importance for parallelizing compilers and performance estimators to examine symbolic expressions with program unknowns such as machine and problem sizes and to solve queries based on systems of constraints (equalities and inequalities). This paper describes novel techniques for counting the number of solutions to a system of constraints, simplifying systems of constraints, computing lower and upper bounds of symbolic expressions, and determining the relationship between symbolic expressions. All techniques target wide classes of linear and non-linearsymbolic expressions and systems of constraints. Our techniques have been implemented and are used as part of a parallelizing compiler and a performance estimator to support analysis and optimization of parallel programs. Various examples and experiments demonstrate the effectiveness of our symbolic analysis techniques. 相似文献
4.
5.
Fahringer T. Scholz B. 《Parallel and Distributed Systems, IEEE Transactions on》2000,11(11):1105-1125
The quality of many optimizations and analyses of parallelizing compilers depends significantly on the ability to evaluate symbolic expressions and on the amount of information available about program variables at arbitrary program points. In this paper, we describe an effective and unified symbolic evaluation framework that statically determines the values of variables and symbolic expressions, assumptions about and constraints between variable values, and the condition under which control flow reaches a program statement. We introduce the program context, a novel representation for comprehensive and compact control and data flow analysis information. Program contexts are described as first order logic formulas, which allows us to use public domain software for standard symbolic manipulation. Computations are represented as algebraic expressions defined over a program's problem size. Our symbolic evaluation techniques comprise accurate modeling of assignment and input/output statements, branches, loops, recurrences, arrays, and procedures. All of our techniques target both linear, as well as nonlinear, expressions and constraints. Efficiency of symbolic evaluation is highly improved by aggressive simplification techniques. A variety of examples, including program verification, dependence analysis, array privatization, communication vectorization, and elimination of redundant communication, are used to illustrate the effectiveness of our approach. We present results from a preliminary implementation of our framework, which is used as part of a parallelizing compiler that demonstrates the potential performance gains achievable by employing symbolic evaluation to support program parallelization. 相似文献
6.
7.
Sungdo Moon Byoungro So Hall M.W. 《Parallel and Distributed Systems, IEEE Transactions on》2000,11(1):36-49
This paper presents the results of an experiment to measure empirically the remaining opportunities for exploiting loop-level parallelism that are missed by the Stanford SUIF compiler, a state-of-the-art automatic parallelization system targeting shared-memory multiprocessor architectures. For the purposes of this experiment, we have developed a run-time parallelization test called the Extended Lazy Privatizing Doall (ELPD) test, which is able to simultaneously test multiple loops in a loop nest. The ELPD test identifies a specific type of parallelism where each iteration of the loop being tested accesses independent data, possibly by making some of the data private to each processor. For 29 programs in three benchmark suites, the ELPD test was executed at run time for each candidate loop left unparallelized by the SUIF compiler to identify which of these loops could safely execute in parallel for the given program input. The results of this experiment point to two main requirements for improving the effectiveness of parallelizing compiler technology: incorporating control flow tests into analysis and extracting low-cost run-time parallelization tests from analysis results 相似文献
8.
The power test for data dependence 总被引:1,自引:0,他引:1
A data dependence decision algorithm called the power test is introduced. The power test is a combination of the extended GCD algorithm and the Fourier-Motzkin method to eliminate variables in a system of inequalities. This is the first test that can generate the information needed for some advanced transformations, and that can handle complex simultaneous loop limits. Previous work in data dependence decision algorithms is reviewed. Some examples which motivated the development of this test are examined, including those which demonstrate the additional power of the power test. Although it may be too expensive for use as a general-purpose dependence test in a compiler, the power test has proved useful in an interactive program restructuring environment 相似文献
9.
循环是程序中蕴含并行性最为丰富的一种结构,因此成为并行化编译最主要的对象.但循环内的过程调用严重妨碍了循环的数据相关性分析,使得循环语句潜在的大量并行性得不到开发.本文提出的循环嵌入方法使部分含过程调用循环语句的并行化成为可能,对部分用其它过程间分析技术也能开发其并行性的这一类循环语句采用循环嵌入方法,并行化开销低,并且分析更精确.采用循环嵌入方法还可降低程序由于多次过程调用带来的调度开销.这一方法在作者开发的自动并行化编译系统AFT(automaticPortrantransformer)中得到了实现,对Spec92测试程序包的试验结果表明了本文提出的方法是行之有效的. 相似文献
10.
11.
12.
13.
Generation of efficient parallel code is a major goal of a well-designed and developed parallelizing compiler. Another important goal is portability of both compiler system and the resulting output source codes. The various choices of current and future parallel computer architectures as well as the cost of developing a parallelizing compiler make portability a very important design goal. Since the design of parallelizing compilers is considerably move complex than designing conventional compilers, it is very important to achieve both efficiency and portability. To meet this dual goal, we have investigated the application of object oriented design to parallelizing compilers. Our parallelizing compiler design is based on abstractions of intermediate representations of loops and their class definitions. In this paper, we address the problem of loop parallelization and propose a framework where the loop parallelization process is divided into three phases and the optimization of loops is performed via a cyclic application of these three phases. The class of each phase is hierarchically derived from intermediate representations of loops. This facilitates the portability of the resulting parallelizing compilers. Furthermore, one of the phases uses a reservation table of hardware resources in order to obtain optimized parallel programs for given hardware resources. The validation of the proposed framework is given through the application of the object oriented design on an example program which is then parallelized efficiently. 相似文献
14.
15.
过程繁衍及其实现方法 总被引:3,自引:2,他引:1
过程的处理在并行化编译工具中是十分关键的问题,过程嵌入和跨过程信息传播是常用的解决方法.近年来,兼有前二者优点的新技术:过程繁衍(Cloning),逐渐受到人们的重视.而以往的研究中,过程繁衍仅局限于常数值的传播.本文提出了在过程繁衍中进行符号等式约束信息传播的方法,该方法可以增强系统中全局的符号分析(SymbolicAnalysis)能力,并可与一些新技术(如Omega测试)互相配合,从而提高并行化系统的能力.该方法在作者开发的并行化编译工具AFT中得到了实现.对于PerfectBenchmark的测试 相似文献
16.
17.
18.
19.
传统的分布存储并行编译系统大多是在共享存储并行编译系统的基础上开发的.共享存储并行编译系统的并行识别技术适合OpenMP代码生成,实现方式是将所有嵌套循环都按照相同的识别方法进行处理,用于分布存储并行编译系统必然会导致无法高效发掘程序的并行性.分布存储并行编译系统应根据嵌套循环结构的特点进行分类处理,提出适合MPI代码生成的并行识别技术.为解决上述问题,根据嵌套循环的结构和MPI并行程序的特点,提出了一种新的嵌套循环分类方法,并针对不同的嵌套循环分别提出了相应的并行识别技术.实验结果表明,与采用传统并行识别技术的分布存储并行编译系统相比,按照所提方法对嵌套循环进行分类,采用相应并行识别技术的编译系统能够更高效地识别基准程序中的并行循环,自动生成的MPI并行代码其性能加速比提高了20%以上. 相似文献