共查询到17条相似文献,搜索用时 62 毫秒
1.
软件流水是循环调度的重要方法.有分支循环的流水依然是个难题.现有算法可以分为4类:循环线性化、路径分离、整体调度和路径选择.它们都未能和谐地解决两个对立问题:转移时间最小化和最差约束问题.提出了基于路径分组和数据相关松弛的软件流水框架,试图无矛盾地解决上述问题.其主要思想是:(1)路径分组,即按照路径的执行概率和转移概率将路径分组,力求最小化转移时间;(2)数据相关松弛,力求避免最差约束,即当循环有多条路径时,有些相关在循环执行中并不一定有实例,理想的策略是仅当它有实例时才遵守.初步实验和定性分析表明,此 相似文献
2.
循环是程序中的热代码,对循环进行有效的优化可以显著缩短程序的执行时间。软件流水是一种开发循环体指令级并行的细粒度循环优化技术,它通过调度循环中连续迭代之间的指令使其并行执行,从而提高了循环的执行效率。实验数据表明,用Cerngoop程序包进行测试,循环优化效果明显。 相似文献
3.
4.
5.
一种支持多重循环软件流水的寄存器结构 总被引:1,自引:0,他引:1
寄存器结构及其分配是软件流水算法的关键之一.为支持多重循环的软件流水,该文提出一种新颖的寄存器结构:半共享跳跃式流水寄存器堆.它可以有效地解决多重循环软件流水下的特殊问题,即:同层次和跨层次的寄存器重命名问题以及断流问题;有效地消除外层循环的体间读写相关,提高程序的指令级并行度.它有3种分配方式可供灵活使用:单个寄存器、流水寄存器和寄存器组方式.流水寄存器方式对生存期确定的、局限于一个循环层次的寄存器重命名问题提供简单而有效的支持.寄存器组分配方式解决了多重循环软件流水时变量生存期不确定的情况.跳跃操作为 相似文献
6.
本在软件流水方面提出一种新观点,把软件流水看作是一种指令级变形,是把一维指令向量变换成二维指令矩阵。这样,软流流水问题可以很自然地分解为两个子问题:一个是确定每个操作在指令矩阵中的行号,另一个是确定其在指令矩阵中的列号。其中这种观战我们开发出一种新的循环调度方法,叫做分解式软件流水-DESP。 相似文献
7.
8.
本文首先在理论上分析了循环体间相关对软件流水的影响.提出了一个由循环本身性质决定的充分必要条件并证明了满足此条件的循环是可限制的,否则是不可限制的;其次我们证明了任意不可限制的循环展开K次后即可转换为可限制循环,K取决于循环本身的性质;最后给出了循环预处理算法和一个新的循环体压缩算法.实验结果表明,这两个算法可使URPR算法对任意循环都能得到最优时间效益并保持了良好的空间效益及低的计算复杂性. 相似文献
9.
10.
11.
体差不等式测试为适用于软件流水领域的高维数组数据相关性分析算法,通过使用更严格的限制条件,该算法可以获得比传统数据相关性分析方法更精确的结果。文章展开该算法的实验研究。实验表明,虽然体差不等式测试算法只能针对实可行解域进行数据相关性分析,但所得到的结果仍然与实际情况相吻合。对于科学计算循环中出现的大多数数据相关性判定问题,使用体差不等式测试算法可以获得很好的效果。 相似文献
12.
Han-Saem Yun Jihong Kim Soo-Mook Moon 《International journal of parallel programming》2003,31(5):339-391
Software pipelining is widely used as a compiler optimization technique to achieve high performance in machines that exploit instruction-level parallelism. However, surprisingly, there have been few theoretical or empirical results on time optimal software pipelining of loops with control flows. In this paper, we present three new theoretical and practical contributions for this underinvestigated problem. First, we propose a necessary and sufficient condition for a loop with control flows to have an optimally software-pipelined program. We also present a decision procedure to compute the condition. As part of the formal treatment of software pipelining, we propose a new formalization of software pipelining. Second, we present two software pipelining algorithms. The first algorithm computes an optimal solution for every loop satisfying the condition, but may run in exponential time. The second algorithm computes optimal solutions efficiently for most (but not all) loops satisfying the condition. The former one proves the sufficiency of the condition and the latter one suggests a practical optimal software pipelining algorithm. Third, we present experimental results which strongly indicate that achieving the time optimality in the software-pipelined programs is a viable goal in practice with reasonable hardware support. 相似文献
13.
Specification of software pipelining using petri nets 总被引:1,自引:0,他引:1
This paper presents a flexible model for software pipelining using the petri nets. Our technique, called the Petri Net Pacemaker
(PNP), can create near optimal pipelines with less algorithmic effort than other techniques. The pacemaker is a novel idea
which exploits the cyclic behavior of petri nets to model the problem of scheduling operations of a loop body for software
pipelining. A way of improving the performance of loops containing predicates is given. The PNP technique also shows how nested
loops can be pipelined. A comparison with some of the other techniques is presented.
THis work was partially supported by the National Science Foundation under grants CDA-9100788 and CDA-9200371. 相似文献
14.
软件流水是一种重要的指令调度技术,它通过同时执行来自不同循环迭代的指令来加快循环的执行时间.随着处理器速度和访存速度差距越拉越大,访存指令尤其是cache miss的访存指令日益成为系统性能提高的瓶颈.由于这些指令的延迟不是固定的,如何在软件流水中预测并掩盖这些访存指令的延迟是非常重要的.与前人预测访存延迟的方法不同,引入cache profiling技术,通过动态收集到profile信息来预测访存延迟,并进行适当的调度.当增加模调度循环中的访存指令的延迟时,启动间隔也会随之增大,导致性能不会随之上升.CSMS算法和FLMS算法在尽量不增大启动间隔的情况下,改变访存指令的延迟.改进了CSMS算法和FLMS算法,根据cache profiling的信息来改变访存延迟,所以比前人的方法更为准确.实验表明,新方法可以有效地提高程序性能,对SPEC2000测试程序平均性能提高1%左右,个别例子的性能改进高达11%. 相似文献
15.
16.
R. Govindarajan N. S. S. Narasimha Rao E. R. Altman Guang R. Gao 《International journal of parallel programming》2000,28(1):1-46
Instruction scheduling methods which use the concepts developed by the classical pipeline theory have been proposed for architectures involving deeply pipelined function units. These methods rely on the construction of state diagrams (or automatons) to (i) efficiently represent the complex resource usage pattern; and (ii) analyze legal initiation sequences, i.e., those which do not cause a structural hazard. In this paper, we propose a state-diagram based approach for modulo scheduling or software pipelining, an instruction scheduling method for loops. Our approach adapts the classical pipeline theory for modulo scheduling, and, hence, the resulting theory is called Modulo-Scheduled pipeline (MS-pipeline) theory. The state diagram, called the Modulo-Scheduled (MS) state diagram is helpful in identifying legal initiation or latency sequences, that improve the number of instructions initiated in a pipeline. An efficient method, called Co-scheduling, which uses the legal initiation sequences as guidelines for constructing software pipelined schedules has been proposed in this paper. However, the complexity of the constructed MS-state diagram limits the usefulness of our Co-scheduling method. Further analysis of the MS-pipeline theory, reveals that the space complexity of the MS-state diagram can be significantly reduced by identifying primary paths. We develop the underlying theory to establish that the reduced MS-state diagram consisting only of primary paths is complete; i.e., it retains all the useful information represented by the original state diagram as far as scheduling of operations is concerned. Our experiments show that the number of paths in the reduced state diagram is significantly lower—by 1 to 3 orders of magnitude—compared to the number of paths in the original state diagram. The reduction in the state diagram facilitate the Co-scheduling method to consider multiple initiations sequences, and hence obtain more efficient schedules. We call the resulting method, enhanced Co-scheduling. The enhanced Co-scheduling method produced efficient schedules when tested on a set of 1153 benchmark loops. Further the schedules produced by this method are significantly better than those produced by Huff's Slack Scheduling method, a competitive software pipelining method, in terms of both the initiation interval of the schedules and the time taken to construct them. 相似文献
17.
Jian Wang Christine Eisenbeis Martin Jourdan Bogong Su 《International journal of parallel programming》1994,22(3):351-373
Software pipelining is an efficient instruction-level loop scheduling technique, but existing software pipelining approaches
have not been widely used in practical and commercial compilers. This is mainly because resource constraints and the cyclic
data dependencies make software pipelining very complicated and difficult to apply. In this paper we present a new perspective
on software pipelining in which it is decomposed into two subproblems—one is free from cyclic data dependencies and can be
effectively solved by the list scheduling technique, and the other is free from resource constraints and can be easily solved
by classical polynomial-time algorithms of graph theory. Based on this new perspective, we develop a new instruction-level
loop scheduling approach, call DEcomposed Software Pipelining (DESP). 相似文献