1.

Unified Interprocedural Parallelism Detection

Jay P. Hoeflinger Yunheung Paek Kwang Yi 《International journal of parallel programming》2001,29(2):185-215

2.

Zhou Jing Zeng Guosun 《Journal of Chinese Computer Systems (小型微型计算机系统)》2007,28(11):1932-1936

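Traditional parallelizing compiler techniques perform dependence analysis at compile time and can effectively parallelize loops, but they cannot exploit parallelism that only becomes apparent at run time. Parallelizing compilers must therefore use run-time dependence analysis to extract as much loop-level parallelism as possible. This paper proposes the affine dependence relation, which eliminates loop-iteration dependences, and, building on speculative parallelization, proposes the SPAD method. A case analysis shows that SPAD is effective. Compared with the LRPD and SPNT methods, SPAD makes important improvements and is therefore a more general speculative parallelization scheme.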

3.

Minimal data dependence abstractions for loop transformations: Extended version

Yi-Qing Yang Corinne Ancourt François Irigoin 《International journal of parallel programming》1995,23(4):359-388

Many abstractions of program dependences have already been proposed, such as the Dependence Distance, the Dependence Direction Vector, the Dependence Level or the Dependence Cone. These abstractions differ in precision. The minimal abstraction associated with a transformation is the abstraction that contains the minimal amount of information necessary to decide when such a transformation is legal. Minimal abstractions for loop reordering and unimodular transformations are presented. As an example, the dependence cone, which approximates dependences by a convex cone of the dependence distance vectors, is the minimal abstraction for unimodular transformations. It also contains enough information for legally applying all loop reordering transformations and finding the same set of valid mono- and multi-dimensional linear schedules as the dependence distance set.
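To make the legality question concrete, here is a minimal sketch that works with explicitly enumerated distance vectors (the most precise abstraction named above) rather than with the dependence cone. It relies on the standard criterion that a unimodular transformation `T` is legal when every transformed distance vector `T * d` stays lexicographically positive. The function names and the example vectors are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[int]
Matrix = List[List[int]]


def lex_positive(v: Vector) -> bool:
    """True if the first nonzero component of v is positive."""
    for x in v:
        if x != 0:
            return x > 0
    return False  # the zero vector is not lexicographically positive


def transform(T: Matrix, d: Vector) -> List[int]:
    """Matrix-vector product T * d."""
    return [sum(t * x for t, x in zip(row, d)) for row in T]


def is_legal(T: Matrix, distances: Sequence[Vector]) -> bool:
    """A transformation is legal if it keeps every dependence lexicographically positive."""
    return all(lex_positive(transform(T, d)) for d in distances)


def direction_vector(d: Vector) -> Tuple[str, ...]:
    """Coarser abstraction: keep only the sign of each distance component."""
    return tuple("<" if x > 0 else ">" if x < 0 else "=" for x in d)


# Hypothetical dependence distance set of a doubly nested loop.
DISTANCES = [(1, -1), (0, 1)]

INTERCHANGE = [[0, 1], [1, 0]]        # swap the two loops
SKEW_THEN_INTERCHANGE = [[1, 1], [1, 0]]

if __name__ == "__main__":
    print(is_legal(INTERCHANGE, DISTANCES))
    print(is_legal(SKEW_THEN_INTERCHANGE, DISTANCES))
    print([direction_vector(d) for d in DISTANCES])
```

With these example distances, plain interchange is rejected because `(1, -1)` becomes `(-1, 1)`, while skewing before interchanging keeps both vectors positive.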

4.

A Polynomial-Time Dependence Test for Determining Integer-Valued Solutions in Multi-Dimensional Arrays Under Variable Bounds

Weng-Long Chang Chih-Ping Chu Jia-Hwa Wu 《The Journal of supercomputing》2005,31(2):111-135

5.

Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory Hierarchies

Sylvain Girbal Nicolas Vasilache Cédric Bastoul Albert Cohen David Parello Marc Sigler Olivier Temam 《International journal of parallel programming》2006,34(3):261-317

Modern compilers are responsible for translating the idealistic operational semantics of the source program into a form that makes efficient use of a highly complex heterogeneous machine. Since optimization problems are associated with huge and unstructured search spaces, this combinational task is poorly achieved in general, resulting in weak scalability and disappointing sustained performance. We address this challenge by working on the program representation itself, using a semi-automatic optimization approach to demonstrate that current compilers often suffer from unnecessary constraints and intricacies that can be avoided in a semantically richer transformation framework. Technically, the purpose of this paper is threefold: (1) to show that syntactic code representations close to the operational semantics lead to rigid phase ordering and cumbersome expression of architecture-aware loop transformations, (2) to illustrate how complex transformation sequences may be needed to achieve significant performance benefits, (3) to facilitate the automatic search for program transformation sequences, improving on classical polyhedral representations to better support operation research strategies in a simpler, structured search space. The proposed framework relies on a unified polyhedral representation of loops and statements, using normalization rules to allow flexible and expressive transformation sequencing. This representation makes it possible to extend the scalability of polyhedral dependence analysis and to delay the (automatic) legality checks until the end of a transformation sequence. Our work leverages algorithmic advances in polyhedral code generation and has been implemented in a modern research compiler.

6.

A Region Dependence Analysis Method Based on Irregular Domains (cited by: 1; self-citations: 0; citations by others: 1)

Zhu Genjiang Xie Li 《Chinese Journal of Computers (计算机学报)》1994,17(3):168-175

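In automatic parallelizing compilation, parallelism detection concentrates mainly on the loop and statement levels, yet many programs can improve performance by exploiting subroutine-level "task" parallelism. This paper proposes a region dependence analysis method based on irregular domains, aimed at uncovering this kind of parallelism. The method precisely characterizes the data access regions of a program and overcomes the conservatism of existing region analysis techniques, thereby increasing the degree of parallelism. The dependence test algorithm is simple and effective.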

7.

Exploitation of parallelism to nested loops with dependence cycles

Weng-Long Chih-Ping Michael 《Journal of Systems Architecture》2004,50(12):729-742

In this paper, we analyze the recurrences from the breakability of the dependence links formed in general multi-statements in a nested loop. The major findings include: (1) A sink variable renaming technique, which can reposition an undesired anti-dependence and/or output-dependence link, is capable of breaking an anti-dependence and/or output-dependence link. (2) For recurrences connected by only true dependences, a dynamic dependence concept and the derived technique are powerful in terms of parallelism exploitation. (3) By the employment of global dependence testing, link-breaking strategy, Tarjan's depth-first search algorithm, and a topological sorting, an algorithm for resolving a general multi-statement recurrence in a nested loop is proposed. Experiments with benchmark cited from Vector loops showed that among 134 subroutines tested, 3 had their parallelism exploitation amended by our proposed method. That is, our offered algorithm increased the rate of parallelism exploitation of Vector loops by approximately 2.24%.
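The graph step named in finding (3) can be sketched generically: Tarjan's depth-first search finds the strongly connected components of a statement dependence graph, and emitting them in reverse completion order gives a topological order of the components. This is a textbook sketch, not the paper's algorithm; it omits dependence testing and link breaking entirely, and the statement names and edges below are hypothetical.

```python
from typing import Dict, List


def tarjan_scc(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Return strongly connected components in topological order (sources first)."""
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    components: List[List[str]] = []
    counter = 0

    def visit(v: str) -> None:
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop the whole component off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(sorted(component))

    for v in graph:
        if v not in index:
            visit(v)
    # Tarjan completes components in reverse topological order, so reverse them.
    components.reverse()
    return components


# Hypothetical dependence graph: an edge S -> T means T depends on S.
DEPENDENCES = {
    "S1": ["S2"],
    "S2": ["S3"],
    "S3": ["S2", "S4"],  # S2 and S3 form a recurrence (a dependence cycle)
    "S4": [],
}

if __name__ == "__main__":
    for component in tarjan_scc(DEPENDENCES):
        kind = "recurrence" if len(component) > 1 else "single statement"
        print(component, kind)
```

In this sketch a multi-statement component marks a recurrence that would need one of its links broken before the loop can be parallelized, while single-statement components without a self-dependence can run in the order produced. The recursive visit is fine for small graphs but would hit Python's recursion limit on very deep ones.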

8.

Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs

Alain Darte Frédéric Vivien 《International journal of parallel programming》1997,25(6):447-496

This paper presents an optimal algorithm for detecting fine or medium grain parallelism in nested loops whose dependences are described by an approximation of distance vectors by polyhedra. In particular, this algorithm is optimal for the classical approximation by direction vectors. This result generalizes, to the case of several statements, Wolf and Lam's algorithm, which is optimal for a single statement. Our algorithm relies on a dependence uniformization process and on parallelization techniques related to systems of uniform recurrence equations. It can also be viewed as a combination of both Allen and Kennedy's algorithm and Wolf and Lam's algorithm.

9.

NUAPC: A Parallelizing Compiler for C++

Zhu Genjiang Xie Li Sun Zhongxiu 《Journal of Computer Science and Technology (计算机科学技术学报)》1997,12(5):458-469

This paper presents a model of an automatic parallelizing compiler for C++ that consists of compile-time and run-time parallelizing facilities. The paper also describes a method for finding both intra-object and inter-object parallelism. The parallelism detection is completely transparent to users.

10.

Segmented Alignment: An Enhanced Model to Align Data Parallel Programs of HPF

Hwang Gwan-Hwan Chen Cheng-Wei Lee Jenq Kuen Dz-Ching Ju Roy 《The Journal of supercomputing》2003,25(1):17-41

In this paper, we propose a new automatic data alignment model called segmented alignment. The conventional data alignment model, such as that used in High-Performance Fortran (HPF), aligns arrays with the whole index domain. The principle of our proposed segmented alignment is to allow alignment relations within delimited index domains. We first provide motivating examples to illustrate how code fragments of HPF with EOSHIFT or CSHIFT operations, or produced by synthesis operations, can benefit from our enhanced alignment scheme. Second, we show that this new model can be implemented in HPF-like languages by adding WHEN and IN constructs to them. In addition, we show that the newly proposed schemes for WHEN and IN constructs can be emulated using standard HPF syntax. Finally, we address issues related to automatic data alignment for the newly proposed model, and present an algorithm to automatically align programs using our segmented alignment scheme. Since the optimal algorithm to do this is NP-hard, a practical heuristic is also given. Our experiments were performed on a DEC Alpha Farm with HPF environments. Our experiments confirm our theory that our proposed alignment scheme can significantly enhance not only the performance of HPF code fragments with EOSHIFT or CSHIFT operations, but also that of codes produced by synthesis operations.

11.

Combined Array Dependence Testing with Banerjee-GCD and Banerjee-Bound

Ma Guokai Zhu Jiahua Zhang Yuanfang Zhu Chuanqi 《Chinese Journal of Computers (计算机学报)》2002,25(2):181-188

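Building on the Banerjee-GCD and Banerjee-Bound methods, and taking full account of how their test results interact as well as the requirements that program parallelization places on dependence testing, this paper proposes a combined array dependence test that applies the Banerjee-GCD and Banerjee-Bound methods to different dependence vectors within a unified framework. The method improves the precision of the test and the usefulness of its results while preserving execution-time efficiency, and it can handle some nonlinear subscript expressions.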
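As background for the two tests being combined, the sketch below shows their textbook forms for a single loop and one-dimensional subscripts. It assumes a write to `A[a*i + c1]` and a read of `A[b*j + c2]` with `lo <= i, j <= hi`, so a dependence requires an integer solution of `a*i - b*j = c` with `c = c2 - c1`. It tests only the unconstrained direction and is not the paper's combined per-direction-vector method; all function names are illustrative.

```python
from math import gcd


def gcd_test(a: int, b: int, c: int) -> bool:
    """GCD test: a*i - b*j = c has an integer solution only if gcd(a, b) divides c."""
    g = gcd(a, b)
    if g == 0:  # both coefficients are zero, so the equation reduces to 0 = c
        return c == 0
    return c % g == 0


def bounds_test(a: int, b: int, c: int, lo: int, hi: int) -> bool:
    """Bounds test: c must lie between the extreme values of a*i - b*j on the loop range."""
    f_min = min(a * lo, a * hi) - max(b * lo, b * hi)
    f_max = max(a * lo, a * hi) - min(b * lo, b * hi)
    return f_min <= c <= f_max


def may_depend(a: int, b: int, c: int, lo: int, hi: int) -> bool:
    """Conservative answer: False proves independence, True means a dependence may exist."""
    return gcd_test(a, b, c) and bounds_test(a, b, c, lo, hi)


if __name__ == "__main__":
    # A[2*i] written, A[2*j + 1] read: even versus odd elements.
    print(may_depend(2, 2, 1, 1, 10))
    # A[i] written, A[j + 100] read in a loop over 1..10: the ranges never meet.
    print(may_depend(1, 1, 100, 1, 10))
    # A[i] written, A[j + 1] read: neither test can rule out a dependence.
    print(may_depend(1, 1, 1, 1, 10))
```

The first case is disproved by the GCD test alone and the second by the bounds test alone, which is one way to see why running both gives a more precise answer than either by itself.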

12.

Dependence Analysis of Recursive Subprograms and Its Applications (cited by: 10; self-citations: 0; citations by others: 10)

Xu Baowen Zhang Ting Chen Zhenqiang 《Chinese Journal of Computers (计算机学报)》2001,24(11):1178-1184

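Program dependence analysis is an important technique for program analysis, comprehension, and maintenance, and is widely used throughout software engineering and software reverse engineering. Dependence analysis across recursive subprograms, however, has long been a difficult point. This paper therefore proposes a new dependence analysis method for recursive subprograms. It first analyzes the dependences inside each subprogram; then, using the subprogram call graph, it analyzes the dependences among subprogram parameters; finally, it analyzes the dependences between recursive subprograms by simulating their execution. The method yields fairly precise dependences between recursive subprograms.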

13.

Parallelization Techniques and Tools

Jin Guohua Chen Fujie 《Journal of Computer Research and Development (计算机研究与发展)》1996,33(7):481-492

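Program parallelization tools effectively address code portability across different parallel machine architectures and greatly reduce the difficulty of using parallel machines, and they have therefore become an active research topic in parallel processing. As parallel systems become more widely used, these tools are expected to keep developing and improving. This paper reviews the research history and current state of key parallelization techniques and tool systems, and offers some views on future trends in this area.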

14.

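An Optimization Algorithm for Increasing the Parallelism of SIMD Architectures Based on Bit-Width Control (cited by: 1; self-citations: 0; citations by others: 1)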

Zhang Weihua Zhu Jiahua Zhang Hongjiang Zang Binyu 《Chinese Journal of Computers (计算机学报)》2009,32(11)

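As SIMD functional units have become widely used as multimedia accelerators, how to exploit this architecture effectively to optimize applications has become a focus of compiler optimization research. Typical SIMD architectures today provide different instruction versions of the same operation for different data bit widths; as operand bit width increases, the number of operations a SIMD instruction can perform simultaneously decreases. Identifying the effective bit width of operands is therefore critical to increasing the parallelism of operations within SIMD instructions during optimization. To address the parallelism problem in SIMD optimization, this paper proposes an optimization algorithm that analyzes the effective bits of operands and then applies overflow control, reducing the operands' reliance on wide data types. Experimental data show that the algorithm effectively increases the parallelism of optimized multimedia programs and achieves good speedups.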
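The relationship between operand bit width and SIMD parallelism can be illustrated with a small calculation. The sketch assumes a 128-bit register, lane widths of 8, 16, 32, and 64 bits, and signed two's-complement values whose range is already known. It does not reproduce the paper's effective-bit analysis or overflow control, and all names are hypothetical.

```python
from typing import Sequence


def min_signed_bits(lo: int, hi: int) -> int:
    """Smallest two's-complement width that can hold every value in [lo, hi]."""
    bits = 1
    while not (-(1 << (bits - 1)) <= lo and hi <= (1 << (bits - 1)) - 1):
        bits += 1
    return bits


def lane_width(bits: int, widths: Sequence[int] = (8, 16, 32, 64)) -> int:
    """Narrowest supported lane width (an assumed set) that fits the given bit count."""
    for w in widths:
        if bits <= w:
            return w
    raise ValueError(f"no lane width fits {bits} bits")


def lanes(register_bits: int, lo: int, hi: int) -> int:
    """Number of operations one SIMD instruction performs for values in [lo, hi]."""
    return register_bits // lane_width(min_signed_bits(lo, hi))


if __name__ == "__main__":
    # Sum of two 8-bit pixels lies in [0, 510], which needs 10 signed bits.
    print(min_signed_bits(0, 510), lanes(128, 0, 510))
    # The same value declared as a 32-bit int would occupy a 32-bit lane.
    print(128 // 32)
```

Under these assumptions, knowing that a pixel sum fits in 10 bits lets the operation use 16-bit lanes, doubling the number of operations per instruction compared with a 32-bit declaration.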

15.

Function-Level Data Dependence Graph and Its Application in Static Vulnerability Analysis

Chen Qian Cheng Kai Zheng Yaowen Zhu Hongsong Sun Limin 《Journal of Software (软件学报)》2020,31(11):3421-3435

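Data-flow analysis is an important means of binary program analysis, but the time and space complexity of building a traditional data dependence graph (DDG) is high, which limits the size of the code that can be analyzed. This paper proposes the concept of the function-level data dependence graph (FDDG) and designs a method for constructing it. Taking function parameters and the dependences among them into account, the method analyzes each function as a whole and ignores its internal implementation, which markedly reduces the size of the data dependence graph and the time and space complexity of generating it. Experimental results show that, compared with the DDG generation method in the open-source tool angr, FDDG generation is generally three orders of magnitude faster. FDDG was also applied to vulnerability analysis of embedded binary firmware in a prototype system, FFVA. In analyzing device firmware from brands including D-Link, NETGEAR, EasyN, and uniview, FFVA found 24 vulnerabilities, 14 of them previously unknown, further confirming the effectiveness of FDDG for static vulnerability analysis.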

16.

Data Dependence Analysis of Assembly Code

Wolfram Amme Peter Braun François Thomasset Eberhard Zehendner 《International journal of parallel programming》2000,28(5):431-467

Determination of data dependences is a task typically performed with high-level language source code in today's optimizing and parallelizing compilers. Very little work has been done in the field of data dependence analysis on assembly language code, but this area will be of growing importance, e.g., for increasing instruction-level parallelism. A central element of a data dependence analysis in this case is a method for memory reference disambiguation which decides whether two memory operations may access (or definitely access) the same memory location. In this paper we describe a new approach for the determination of data dependences in assembly code. Our method is based on a sophisticated algorithm for symbolic value propagation, and it can derive value-based dependences between memory operations instead of just address-based dependences. We have integrated our method into the Salto system for assembly language optimization. Experimental results show that our approach greatly improves the precision of the dependence analysis in many cases.

17.

FAX: An Automatic Program Parallelization Tool

Guo Kerong Tang Xinchun Zeng Lifang 《Computer Engineering and Applications (计算机工程与应用)》1999,(9)

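This paper gives an overview of FAX (Fortran Automated Xlator), an automatic program parallelization tool for massively parallel processing systems, with emphasis on the advanced techniques it adopts. Test results show that FAX is already reasonably usable and effective and, as an automatic parallelization tool for distributed-memory parallel machines, has largely met its design goals.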

18.

Dynamic resolution: A runtime technique for the parallelization of modifications to directed acyclic graphs

Lorenz Huelsbergen 《International journal of parallel programming》1997,25(5):385-417

19.

An Inter-Class Data Dependence Analysis Method Incorporating Exception Propagation

Zhang Yanmei Jiang Shujuan Yuan Guan 《Microcomputer Information (微计算机信息)》2010,(6)

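Inter-class data dependence analysis is the basis of inter-class data-flow testing. By analyzing how exception propagation affects program data dependences in class-cluster-level testing, this paper proposes an inter-class data dependence analysis method for C++ programs that covers exception constructs. The method builds the inter-class data dependence graph incrementally according to the relationships between classes, and an algorithm for constructing the graph is given. Finally, the method is applied to program slicing. The results show that analyzing the effect of exception propagation on data dependences improves slice precision.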

20.

Q-Learning-Based Dynamic Dependence Analysis of Complex Programs

Wang Yubao Shi Liang Xu Baowen 《Computer & Digital Engineering (计算机与数字工程)》2005,33(2):9-11,20

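Accurately characterizing the dynamic behavior of complex programs in uncertain environments is a difficulty for traditional program analysis theory and techniques, and a pressing problem for many important systems that need dynamic, real-time program analysis and control. This paper proposes the idea of Q-learning-based dynamic analysis of complex programs and constructs a basic algorithm for Q-learning-based dynamic dependence analysis. It also studies the implementation of dynamic real-time analysis and control for complex programs and discusses related issues, with the aim of making the analysis and control process leaner, more intelligent, and more efficient.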