1.

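李文龙 汤志忠 《计算机科学》2004,31(3):163-166

Loop parallelization is one of the core problems in parallelizing compilation. Many scientific computing programs spend most of their execution time in loops, so exploiting loop parallelism effectively improves the efficiency of the whole program. Nested loops are the most common case, which makes parallelizing them important both in theory and in practice. The rapid growth of hardware resources in modern processors also makes it necessary to exploit parallelism across the entire multidimensional iteration space. Most current software pipelining algorithms handle only the innermost loop, and only a few apply software pipelining to nested loops. This paper introduces several software pipelining algorithms for nested loops and compares their similarities and differences, providing guidance for choosing an algorithm when implementing a compiler.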

2.

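Software Pipelining Algorithms for Loop Optimization in RISC Multiple-Issue Architectures  Total citations: 2, self-citations: 0, citations by others: 2

罗玉华 李三立 《计算机学报》1993,16(9):692-700

Software pipelining is a highly effective loop optimization technique. This paper surveys the basic ideas of software pipelining and the classification of its algorithms; describes three typical algorithms in detail: Lam's algorithm, perfect pipelining, and enhanced pipeline scheduling; analyzes and compares them in terms of time efficiency, space efficiency, and computational complexity; and finally evaluates software pipelining as a technique.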
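As a rough illustration of the basic idea this survey refers to, the sketch below overlaps the stages of consecutive iterations of a simple loop. It is a hand-written Python model, not any of the three algorithms named in the abstract; the three-stage split (load, multiply, store) and the function names `sequential` and `software_pipelined` are assumptions chosen only for this example.

```python
def sequential(a, k):
    """Reference loop: each iteration loads, multiplies, then stores."""
    out = [0] * len(a)
    for i in range(len(a)):
        x = a[i]        # stage 1: load
        y = x * k       # stage 2: multiply
        out[i] = y      # stage 3: store
    return out


def software_pipelined(a, k):
    """Same result, with stages of three consecutive iterations overlapped."""
    n = len(a)
    if n < 2:
        return sequential(a, k)   # too short to fill the pipeline
    out = [0] * n
    # Prologue: fill the pipeline.
    x = a[0]            # load for iteration 0
    y = x * k           # multiply for iteration 0
    x = a[1]            # load for iteration 1
    # Kernel: each pass does store(i), multiply(i + 1), load(i + 2).
    # On a multiple-issue machine these three operations could issue together.
    for i in range(n - 2):
        out[i] = y
        y = x * k
        x = a[i + 2]
    # Epilogue: drain the two iterations still in flight.
    out[n - 2] = y
    y = x * k
    out[n - 1] = y
    return out
```

Calling `software_pipelined(a, k)` returns the same list as `sequential(a, k)`; the point of the rewrite is that the kernel body contains independent operations from different iterations, which is what gives the scheduler parallelism to exploit.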

3.

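A New Instruction-Level Optimization Method for Loops with Conditional Branches

汤志忠 王剑 《软件学报》1995,6(1):148-156

Decomposed software pipelining (DESP) is a new method we recently proposed for effectively scheduling branch-free loops. It decomposes the loop scheduling problem into two subproblems and turns branch-free loop scheduling into the scheduling of an acyclic graph, which can then be solved with classical polynomial-time methods from graph theory. In this paper we extend DESP so that it can optimize loops with conditional branches; the extended method is called global decomposed software pipelining (GDESP).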

4.

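Software Pipelining Algorithms for Compiler Optimization in RISC Multiple-Issue Architectures

金燕平 周颖 《计算机工程》1996,22(1):66-72

Software pipelining is a commonly used optimization technique in compilers for RISC multiple-issue architectures and a highly effective loop optimization method. This paper introduces the basic ideas of software pipelining and the classification of its algorithms, describes in detail and compares three software pipelining algorithms (the URPR algorithm, Lam's algorithm, and the DESP algorithm), and draws conclusions from the comparison.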

5.

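A Data Allocation Algorithm for Software Pipelining

罗军 汤志忠 张赤红 于涛 《软件学报》1998,9(1):74-79

Memory allocation for data elements is an unavoidable key problem in instruction-level parallel optimizing compilation, and how well it is solved directly affects the efficiency of compiler optimization. Section 1 of this paper introduces the basic idea of the ILSP (interlaced inner and outer loop software pipelining) algorithm. Section 2 uses matrix multiplication as an example to describe how data elements in nested loops are accessed under ILSP. Section 3 analyzes these access characteristics theoretically and gives an effective memory allocation algorithm for data elements under ILSP for general nested loops. Section 4 presents an experimental comparison, followed by the conclusion.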

6.

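A Register Architecture Supporting Software Pipelining of Nested Loops  Total citations: 1, self-citations: 0, citations by others: 1

容红波 汤志忠 《软件学报》2000,11(3):401-409

Register organization and allocation are among the keys to software pipelining algorithms. To support software pipelining of nested loops, this paper proposes a novel register architecture: the semi-shared jumping pipeline register file. It effectively solves problems specific to software pipelining of nested loops, namely register renaming within a loop level and across loop levels, and the pipeline interruption problem. It also effectively eliminates loop-carried read/write dependences of outer loops and raises the instruction-level parallelism of the program. It offers three allocation modes that can be used flexibly: single register, pipeline register, and register group. The pipeline register mode gives simple and effective support for register renaming when lifetimes are known and confined to one loop level. The register group mode handles variables whose lifetimes cannot be determined under software pipelining of nested loops. The jump operation ...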

7.

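GURPR: A New Global Software Pipelining Method

苏伯珙 王剑 《计算机工程与应用》1990,(Z1)

Software pipelining is an effective means of optimizing loops in microprograms and programs. It has developed from the traditional technique, which optimizes only loops consisting of a single basic block, into global software pipelining. Existing global software pipelining methods, however, are unsatisfactory on three main measures: the time efficiency of the optimized code, its space efficiency, and computational complexity. This paper therefore proposes a new global software pipelining technique, GURPR, and presents its basic idea and main algorithms. Analysis and examples indicate that the overall performance of GURPR is better than that of other global software pipelining methods.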

8.

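An MIMD Architecture That Efficiently Executes Loops with Branches

赵巍 汤志忠 《计算机工程与应用》1992,(8):41-44

This paper proposes a new MIMD architecture that efficiently supports the execution of loops containing branches. With the support of software pipelining, the architecture can flexibly handle the adverse effects that branches inside a loop have on its parallel execution, so that loops run with excellent time and space efficiency. After describing the architecture, the paper also describes the preliminary design of its optimizing compiler.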

9.

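Software Pipelining Techniques for Nested Loops  Total citations: 1, self-citations: 1, citations by others: 0

汤志忠 王雷 钱江 《软件学报》1996,7(7):422-427

To solve the problem of instruction-level parallelizing compilation for nested loops, this paper proposes the rumination method (反刍方法), which takes a new view of nested loops by treating them as a single program flow and thereby exploits their parallelism effectively. The paper also gives the basic steps for implementing the method and the hardware support it requires. Finally, preliminary experimental results confirm the effectiveness of the algorithm; its time and space efficiency are discussed and its main characteristics analyzed.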

10.

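Decomposed Software Pipelining (DESP): A New Method for Exploiting Instruction-Level Parallelism in Loops

汤志忠 王剑 《软件学报》1995,6(1):138-147

This paper proposes a new view of software pipelining: it regards software pipelining as an instruction-level transformation that converts a one-dimensional instruction vector into a two-dimensional instruction matrix. The software pipelining problem then decomposes naturally into two subproblems: determining the row number of each operation in the instruction matrix, and determining its column number. Based on this view, we develop a new loop scheduling method called decomposed software pipelining (DESP).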
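To make the vector-to-matrix view concrete, here is a minimal Python sketch. It does not reproduce how DESP actually computes row and column numbers; it assumes the issue cycle of each operation and the initiation interval `ii` are already known, and it uses the usual modulo-scheduling convention (row = cycle mod `ii`, column = cycle div `ii`). The function name `instruction_matrix` and the sample schedule are hypothetical.

```python
def instruction_matrix(issue_time, ii):
    """Place the operations of one iteration into a 2D instruction matrix.

    issue_time: dict mapping operation name -> issue cycle, measured from the
                start of that operation's own iteration (assumed input).
    ii:         initiation interval, i.e. the number of rows in the new loop body.
    Returns (rows, placement), where placement[op] == (row, column) and
    rows[r][c] is the list of operations placed in row r, column c.
    """
    if not issue_time:
        return [], {}
    # Row: cycle inside the new loop body. Column: how many iterations
    # the operation lags behind the iteration that has just started.
    placement = {op: (t % ii, t // ii) for op, t in issue_time.items()}
    n_cols = max(col for _, col in placement.values()) + 1
    rows = [[[] for _ in range(n_cols)] for _ in range(ii)]
    for op, (row, col) in sorted(placement.items()):
        rows[row][col].append(op)
    return rows, placement
```

For instance, with the made-up schedule `{"load": 0, "mul": 1, "add": 3, "store": 4}` and `ii = 2`, `add` lands in row 1, column 1, and `store` in row 0, column 2: the new loop body has two rows, and the operations in one row belong to three different iterations.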

11.

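A Data Scheduling Algorithm for Software Pipelining

罗军 汤志忠 张赤红 《软件学报》1998,9(6):474-480

Section 1 analyzes the scheduling of data elements in nested loops under software pipelining. It focuses on the significance of using address counters to perform simple address arithmetic and on the basic idea of the ILSP (interlaced inner and outer loop software pipelining) algorithm, and on that basis analyzes the characteristics of data element scheduling in nested loops under software pipelining. Section 2 explores a general method for finding the address control information sequence needed to carry out the scheduling. Sections 3 and 4 discuss, respectively, how the resulting sequence is used to control the address counters' accesses to data elements, and the steps for reducing it to a compact address control information sequence. The last two sections present the experimental results and the conclusion.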

12.

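Loop-Carried Dependences and an Improved URPR Software Pipelining Method

苏伯珙 王剑 《计算机学报》1992,15(7):499-506

This paper first analyzes theoretically how loop-carried dependences affect software pipelining. It proposes a necessary and sufficient condition determined by the properties of the loop itself and proves that loops satisfying the condition are restrictable while all other loops are not. It then proves that any unrestrictable loop becomes restrictable after being unrolled K times, where K depends on the properties of the loop. Finally, it gives a loop preprocessing algorithm and a new loop body compaction algorithm. Experimental results show that with these two algorithms URPR achieves optimal time efficiency for arbitrary loops while retaining good space efficiency and low computational complexity.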

13.

Time Optimal Software Pipelining of Loops with Control Flows

Han-Saem Yun Jihong Kim Soo-Mook Moon 《International journal of parallel programming》2003,31(5):339-391

Software pipelining is widely used as a compiler optimization technique to achieve high performance in machines that exploit instruction-level parallelism. However, surprisingly, there have been few theoretical or empirical results on time optimal software pipelining of loops with control flows. In this paper, we present three new theoretical and practical contributions for this underinvestigated problem. First, we propose a necessary and sufficient condition for a loop with control flows to have an optimally software-pipelined program. We also present a decision procedure to compute the condition. As part of the formal treatment of software pipelining, we propose a new formalization of software pipelining. Second, we present two software pipelining algorithms. The first algorithm computes an optimal solution for every loop satisfying the condition, but may run in exponential time. The second algorithm computes optimal solutions efficiently for most (but not all) loops satisfying the condition. The former one proves the sufficiency of the condition and the latter one suggests a practical optimal software pipelining algorithm. Third, we present experimental results which strongly indicate that achieving the time optimality in the software-pipelined programs is a viable goal in practice with reasonable hardware support.

14.

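A Software De-Pipelining Algorithm for IA-64

汪淼 赵荣彩 蔡国明 《计算机工程与应用》2007,43(23):58-60

Software pipelining is a loop optimization technique that effectively increases instruction-level parallelism. Because processors differ in implementation, loop code that has been software-pipelined for one processor is hard to port to and reuse on another. Software de-pipelining is the inverse of software pipelining: it removes the software pipelining characteristics from loop code so that the code can be ported across platforms. Based on the IA-64 architecture, this paper analyzes the characteristics of software-pipelined code and proposes a de-pipelining algorithm that removes software pipelining from executable binaries produced by the ICC compiler and converts them into semantically equivalent C code.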

15.

Decomposed software pipelining: A new perspective and a new approach  Total citations: 1, self-citations: 0, citations by others: 1

Jian Wang Christine Eisenbeis Martin Jourdan Bogong Su 《International journal of parallel programming》1994,22(3):351-373

Software pipelining is an efficient instruction-level loop scheduling technique, but existing software pipelining approaches have not been widely used in practical and commercial compilers. This is mainly because resource constraints and the cyclic data dependencies make software pipelining very complicated and difficult to apply. In this paper we present a new perspective on software pipelining in which it is decomposed into two subproblems: one is free from cyclic data dependencies and can be effectively solved by the list scheduling technique, and the other is free from resource constraints and can be easily solved by classical polynomial-time algorithms of graph theory. Based on this new perspective, we develop a new instruction-level loop scheduling approach, called DEcomposed Software Pipelining (DESP).
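The list scheduling technique mentioned for the first subproblem can be sketched in a few lines of Python. This is a generic greedy list scheduler, not the scheduler used in DESP: the single issue-width limit `width` stands in for the machine's resource constraints, ready operations are picked in alphabetical order rather than by a critical-path priority, and the function name `list_schedule` is an assumption made for this example.

```python
def list_schedule(deps, latency, width):
    """Greedy list scheduling of an acyclic dependence graph.

    deps:    dict op -> list of ops that must complete first.
             The graph must be acyclic, otherwise the loop never ends.
    latency: dict op -> number of cycles the op takes.
    width:   maximum number of ops that may issue in one cycle.
    Returns a dict op -> issue cycle.
    """
    start = {}
    remaining = set(deps)
    cycle = 0
    while remaining:
        # An op is ready once every predecessor has issued and finished.
        ready = sorted(
            op for op in remaining
            if all(p in start and start[p] + latency[p] <= cycle
                   for p in deps[op])
        )
        # Issue as many ready ops as the machine width allows.
        for op in ready[:width]:
            start[op] = cycle
            remaining.discard(op)
        cycle += 1
    return start
```

As a usage example, take `deps = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"]}` with latency 2 for `c` and 1 for the others. With `width=1` the ops issue at cycles 0, 1, 2, and 4; with `width=2`, `a` and `b` share cycle 0 and `d` issues at cycle 3.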

16.

Processor Array Synthesis from Shift-Variant Deep Nested Do Loops

Kittitornkun Surin Hu Yu Hen 《The Journal of supercomputing》2003,24(3):229-249

The consolidation of Internet devices into a universal/portable device will soon be accomplishable through the incorporation of reconfigurable computing in system-on-a-chip (SOC). At any particular moment, it could be a video/audio mobile phone, an MP3 song player, or another device. The basic construct of these multimedia processing algorithms can be described as deep nested Do loop algorithms. They are considered the most demanding data-intensive algorithms and hence ideal candidates for an array of reconfigurable nanoprocessors. Therefore, an algorithm-to-hardware synthesis methodology is important for an efficient exploitation of both spatial parallelism and temporal pipelining. In this paper, we propose a processor array synthesis methodology. It can map an n-level nested Do loop represented by a nonuniform or shift-variant data dependence graph to a near-optimal one- or two-dimensional processor array under the available resource constraints to satisfy high-throughput computation demands.

17.

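Software Pipelining Elimination for IA-64 Binary Translation

崔雪冰 张俊峰 崔平非 《计算机工程》2010,36(11):88-89,92

In reverse engineering, software-pipelined loops make reverse translation difficult. This paper proposes a solution for handling software-pipelined loops in IA-64 binary translation that uses a direct semantic mapping algorithm. Experiments verify that the algorithm is effective at handling software-pipelined code in binary translation, laying a foundation for handling such code in IA-64 binary translation.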

18.

Trace Software Pipelining

Wang Jian Andreas Krall M. Anton Ertl 《计算机科学技术学报》1995,10(6):481-490

Global software pipelining is a complex but efficient compilation technique to exploit instruction-level parallelism for loops with branches. This paper presents a novel global software pipelining technique, called Trace Software Pipelining, targeted to instruction-level parallel processors such as Very Long Instruction Word (VLIW) and superscalar machines. Trace software pipelining applies a global code scheduling technique to compact the original loop body. The resulting loop is called a trace software pipelined (TSP) code. The trace software pipelined code can be directly executed with special architectural support or can be transformed into a globally software pipelined loop for the current VLIW and superscalar processors. Thus, exploiting parallelism across all iterations of a loop can be completed through compacting the original loop body with any global code scheduling technique. This makes our new technique very promising in practical compilers. Finally, we also present the preliminary experimental results to support our new approach.

19.

Novel Neighborhood Search for Multiprocessor Scheduling with Pipelining

《Journal of Parallel and Distributed Computing》2002,62(1):85-110

This paper presents a neighborhood search algorithm for heterogeneous multiprocessor scheduling in which loop pipelining is used to exploit parallelism between iterations. The method adopts a realistic model for interprocessor communication where resource contention is taken into consideration. The schedule representation scheme is flexible so that communication scheduling can be performed in a generic manner. Based on a general time formulation of the schedule performance, the algorithm improves an initial schedule in an efficient way by successive modification to the task-processor mapping and task ordering. Simulation results show that significant improvement over existing methods can be obtained. A parallel software video encoder was implemented based on the scheduling result and real-time performance was achieved with pipelining of frame encoding.

20.

A Pipelining Loop Optimization Method for Dataflow Architecture

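Xu Tan Xiao-Chun Ye Xiao-Wei Shen Yuan-Chao Xu Da Wang Lunkai Zhang Wen-Ming Li Dong-Rui Fan Zhi-Min Tang 《计算机科学技术学报》2018,33(1):116-130

With the coming of the exascale supercomputing era, power efficiency has become the most important obstacle to building an exascale system. Dataflow architecture has a native advantage in achieving high power efficiency for scientific applications. However, state-of-the-art dataflow architectures fail to exploit high parallelism in loop processing. To address this issue, we propose a pipelining loop optimization method (PLO), which makes the iterations of a loop flow through the processing element (PE) array of a dataflow accelerator. The method consists of two techniques: architecture-assisted hardware iteration and instruction-assisted software iteration. In the hardware iteration execution model, an on-chip loop controller is designed to generate loop indexes, reducing the complexity of the computing kernel and laying a good foundation for pipelined execution. In the software iteration execution model, additional loop instructions are introduced to solve the iteration dependency problem. With these two techniques, the average number of instructions ready to execute per cycle is increased, keeping the floating-point units busy. Simulation results show that the proposed method outperforms the static and dynamic loop execution models in floating-point efficiency by 2.45x and 1.1x on average, respectively, while the hardware cost of the two techniques is acceptable.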