Similar Literature
 Found 18 similar documents; search took 140 ms
1.
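Rong Hongbo (容红波), Tang Zhizhong (汤志忠). Journal of Software (《软件学报》), 2000, 11(5): 646-653
Flow dependence is a key factor affecting loop scheduling on VLIW (very long instruction word) architectures. Existing research has not exploited the lock-step property of VLIW. Using this property, and centered on the concept of inclusion, this paper proposes a complete mathematical model for flow-dependence analysis on VLIW architectures. It finds that the set of inter-body (loop-carried) flow dependences can be partitioned into several disjoint linearly ordered sets, and that there exists one and only one independent, all-inclusive set of flow dependences (a basis) that makes all other flow dependences unnecessary. The model allows multi-cycle operations and conditional branches. The result can serve as a mathematical foundation for VLIW research and can also be used in engineering practice.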

2.
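Gui Zhongyan (桂忠艳), Yang Jing (杨静), Xie Zhiqiang (谢志强). Control and Decision (《控制与决策》), 2017, 32(11): 1921-1932
To address redundant scheduling-order constraints among operations and the multiple processing modes between operations and machines in flexible job-shop scheduling, a flexible job-shop scheduling algorithm based on pruning and layering is proposed. The algorithm first represents the operations and their scheduling-order relations as a directed acyclic graph, uses pruning to eliminate redundant arcs, and uses layering to assign the nodes to layers. It then classifies the processing modes and formulates an operation-machine reservation strategy and an operation-machine pre-allocation strategy. Finally, it adopts an event-driven strategy, scheduling operations at each driving moment according to the proposed flexible processing strategies. Theoretical analysis and examples show that the proposed algorithm achieves good scheduling results.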
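
The pruning and layering steps described in this abstract can be illustrated with a minimal Python sketch. It assumes that "redundant arc" means an arc implied by a longer path (a transitive reduction) and that a node's layer is its longest distance from a source node; the abstract does not confirm either definition, and the function names `prune_redundant_arcs` and `layer_nodes` are hypothetical.

```python
from collections import defaultdict


def _reachable(graph, src, dst, skip_edge):
    """Return True if dst can be reached from src without using skip_edge."""
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if (node, nxt) == skip_edge or nxt in seen:
                continue
            if nxt == dst:
                return True
            seen.add(nxt)
            stack.append(nxt)
    return False


def prune_redundant_arcs(edges):
    """Drop every arc (u, v) of a DAG that is implied by another u -> v path.

    Assumption: this is what the abstract calls a redundant arc.
    """
    edges = list(dict.fromkeys(edges))  # remove duplicates, keep order
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
    return [(u, v) for u, v in edges if not _reachable(graph, u, v, (u, v))]


def layer_nodes(edges):
    """Assign each node the length of the longest path from any source node."""
    preds = defaultdict(set)
    nodes = set()
    for u, v in edges:
        preds[v].add(u)
        nodes.update((u, v))
    layers = {}

    def depth(node):
        if node not in layers:
            layers[node] = 1 + max((depth(p) for p in preds[node]), default=-1)
        return layers[node]

    for node in nodes:
        depth(node)
    return layers


# Hypothetical precedence constraints among four operations.
constraints = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
pruned = prune_redundant_arcs(constraints)
layers = layer_nodes(pruned)
```

Here the arc `("A", "C")` is dropped because the path A, B, C already enforces that order. The reservation, pre-allocation, and event-driven steps of the algorithm are not covered by this sketch.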

3.
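Very long instruction word (VLIW) processors generally use a bus-interconnected multi-cluster structure: the functional units in each cluster share a local register file, and data is transferred between clusters over a bus, which avoids the rapid growth in delay, area, and power of a fully connected structure as the number of functional units increases. However, the copies and delays incurred when sharing data between clusters reduce processor performance. This paper proposes a multi-cluster VLIW structure interconnected by register files, which uses register files to connect the clusters and thereby avoids inter-cluster transfer delay and extra data-copy operations. An instruction scheduling algorithm for this structure is also proposed to improve scheduling performance. Experimental results show that, compared with a fully connected VLIW structure, the register-file-interconnected structure loses only about 13% in performance while code size remains essentially unchanged; both results are better than those of the bus-interconnected multi-cluster structure.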

4.
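Normalized design of XML database schemas produces a set of related XML schemas or DTDs that can represent the dependencies among data and are free of redundancy; its purpose is to avoid anomalies in information retrieval on the Internet. The existence of certain data dependencies in an XML database schema is the cause of redundancy, so the relationship between data dependencies and redundancy is the key issue in research on normalized design. Data dependencies in XML database schemas include dependencies among attributes and dependencies among elements; the associated structural redundancy and irregularity refer to redundancy and irregularity in the structure of the XML schema. The paper gives a definition of XML database schema data dependency that combines inter-attribute and inter-element dependencies, analyzes the associated structural redundancy and irregularity, defines on this basis a normal form as a normalized XML schema forest free of structural redundancy and irregularity, and presents and verifies a normalization design algorithm.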

5.
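Branch scheduling is an instruction scheduling technique that effectively eliminates branch instruction delay and is very important for improving the performance of VLIW-class processors. An instruction scheduling optimization algorithm for branch delay slots is proposed. Targeting VLIW architectures, the algorithm selects suitable candidate instruction sequences according to the program dependence graph and builds a cost-benefit model to produce a high-benefit instruction schedule for the branch delay slots. Experimental data show that the branch scheduling algorithm improves application performance by 12.9% on average.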

6.
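Research on a GCC-Based VLIW Compilation System   Total citations: 1, self-citations: 1, citations by others: 0
A VLIW machine issues and executes multiple parallel operations in a single machine cycle, achieving high instruction-level parallelism. Dependence analysis and scheduling of these operations are left entirely to the compiler, so whether the parallel performance of VLIW is fully realized depends on the quality of the architecture-specific compiler. GCC, developed by GNU, is one of the most widely used compilation systems; it supports multiple languages and platforms, has an open structure, and can apply a variety of mature conventional optimization techniques to generate efficient code. This paper analyzes the structural characteristics of VLIW and GCC and proposes a design for a GCC-based VLIW compilation system: GCC performs architecture-independent optimizations and a small number of architecture-dependent optimizations at the RTL intermediate-code level, and architecture-dependent optimizations for the VLIW structure are performed at the assembly-code level. This makes full use of GCC's mature compilation technology to rapidly develop an efficient multi-language VLIW compilation system.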

7.
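A Model for Multimedia Synchronization Based on RSVP and WFQ   Total citations: 2, self-citations: 0, citations by others: 2
Synchronization is a key issue in real-time multimedia communication over the Internet. The scheduling mechanisms of existing network elements (NEs) cannot preserve the inter-stream synchronization of related streams. Combining RSVP with the WFQ scheduling mechanism, a model for real-time multimedia communication is proposed: resources are reserved for multiple related streams along the same end-to-end path, and each network element along the path schedules these streams jointly, maintaining a stable synchronization relationship among their data units. This greatly reduces the skew and jitter among data units of related streams at the destination and simplifies synchronization processing there.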

8.
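To address the problems of data scheduling algorithms in traditional concurrent multipath transfer, a data-scheduling algorithm using bandwidth estimation and forward trip-time (DA-BEFT) is proposed based on the MPTCP protocol. The algorithm takes full account of the impact of large differences in transmission delay among subflows and, combined with a well-performing retransmission path-selection policy, reduces receiver buffer blocking caused by out-of-order data and improves the throughput of the whole connection. Simulation experiments verify that DA-BEFT improves bandwidth utilization and network throughput when the delay difference among subflows varies.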

9.
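This paper studies the problem of oil price trends. Oil prices are the combined result of many influencing factors and behave in a complex, nonlinear way. There is a great deal of redundant information among the factors, and traditional forecasting methods can neither eliminate this redundancy nor accurately describe the nonlinear behavior of oil prices, so their forecasting accuracy is low. To improve accuracy, an oil price forecasting method combining principal component analysis with a support vector machine is proposed. Principal component analysis is first applied to the influencing factors to eliminate redundant information among them and reduce the dimensionality of the support vector machine's input; the support vector machine is then used to model and forecast with the retained principal components. Simulation experiments on oil price data show that the model eliminates redundant information, speeds up learning, and improves forecasting accuracy, providing an effective method for oil price forecasting.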

10.
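A Necessary and Sufficient Condition for the Absence of Implicit Redundancy in XML Schemas   Total citations: 1, self-citations: 0, citations by others: 1
Normalized design of XML database schemas produces a set of related XML schemas or DTDs that can represent the dependencies among data and are free of redundancy, so that information retrieval works better. The existence of certain data dependencies in an XML database schema is the cause of redundancy, so the relationship between data dependencies and redundancy is the key issue in research on normalized design, yet no dedicated study of this issue exists so far. Data dependencies in XML database schemas include dependencies among attributes and dependencies among elements. The paper gives a definition of data dependency that combines the two, analyzes the associated implicit redundancy, and proves that implicit redundancy is absent from an XML schema if and only if the schema is normalized, laying a theoretical foundation for deeper research on the normalized design of XML database schemas.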

11.
In this paper we address the problem of partitioning nested loops with non-uniform (irregular) dependence vectors. Parallelizing and partitioning nested loops requires efficient inter-iteration dependence analysis. Although many methods exist for nested loop partitioning, most of these perform poorly when parallelizing nested loops with irregular dependences. Unlike nested loops with uniform dependences, these have a complicated dependence pattern which forms a non-uniform dependence vector set. We apply the results of classical convex theory and principles of linear programming to iteration spaces and show the correspondence between minimum dependence distance computation and iteration space tiling. Cross-iteration dependences are analyzed by forming an Integer Dependence Convex Hull (IDCH). Every integer point in this IDCH corresponds to a dependence vector in the iteration space of the nested loops. A simple way to compute minimum dependence distances from the dependence distance vectors of the extreme points of the IDCH is presented. Using these minimum dependence distances, the iteration space can be tiled. Iterations within a tile can be executed in parallel, and the different tiles can then be executed with proper synchronization. We demonstrate that our technique gives much better speedup and extracts more parallelism than the existing techniques.
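
The link between minimum dependence distance and tiling can be illustrated with a deliberately simplified Python sketch. It does not build the IDCH. It assumes the dependence distance vectors are already known and that every one has a positive outer-loop component, so the outer loop can be cut into strips whose width equals the minimum outer distance. The function names and sample vectors are hypothetical.

```python
def min_outer_distance(dep_vectors):
    """Smallest outer-loop distance over all dependence vectors (di, dj).

    Assumption: every dependence crosses at least one outer iteration (di >= 1).
    """
    if any(di <= 0 for di, _ in dep_vectors):
        raise ValueError("sketch only handles dependences with di >= 1")
    return min(di for di, _ in dep_vectors)


def tile_outer_loop(n_outer, width):
    """Cut outer iterations 0..n_outer-1 into consecutive strips of `width`."""
    return [range(start, min(start + width, n_outer))
            for start in range(0, n_outer, width)]


def tile_is_independent(tile, dep_vectors):
    """Check that no dependence has both its source and sink rows in the tile."""
    rows = set(tile)
    return not any(i + di in rows for i in tile for di, _ in dep_vectors)


# Hypothetical non-uniform dependence distance vectors (di, dj).
deps = [(2, -1), (3, 1), (5, 0)]
width = min_outer_distance(deps)
tiles = tile_outer_loop(7, width)
```

Two rows in the same strip differ by at most `width - 1`, which is less than every dependence distance, so the iterations of a strip are mutually independent and the strips run one after another.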

12.
Precise value-based data dependence analysis for scalars is useful for advanced compiler optimizations. The new method presented here for flow and output dependence uses Factored Use and Def chains (FUD chains), our interpretation and extension of Static Single Assignment. It is precise with respect to conditional control flow and dependence vectors. Our method detects dependences which are independent with respect to arbitrary loop nesting, as well as loop-carried dependences. A loop-carried dependence is further classified as being carried from the previous iteration, with distance 1, or from any previous iteration, with direction <. This precision cannot be achieved by traditional analysis, such as dominator information or reaching definitions. To compute anti- and input dependence, we use Factored Redef-Use chains, which are related to FUD chains. We are not aware of any prior work which explicitly deals with scalar data dependence utilizing a sparse graph representation. A preliminary version of this paper appeared in the Seventh Annual Workshop on Languages and Compilers for Parallel Computing, August 1994. Supported in part by NSF Grant CCR-9113885 and a grant from Intel Corporation and the Oregon Advanced Computing Institute.

13.
The El'brus-3 and MARS-M represent two recent efforts to address the Soviet Union's high-performance computing needs through original, indigenous development. The El'brus-3 extends very long instruction word (VLIW) concepts to a multiprocessor environment and offers features that increase performance and efficiency and decrease code size for both scientific and general-purpose applications. It incorporates procedure static and globally dynamic instruction scheduling, multiple, simultaneous branch path execution, and iteration frames for executing loops with recurrences and conditional branches. The MARS-M integrates VLIW, data flow, decoupled heterogeneous processors, and hierarchical systems into a unified framework. It also offers a combination of static and dynamic VLIW scheduling. While the viability of these machines has been demonstrated, significant barriers to their production and use remain. This paper was written nearly entirely by means of e-mail between Tucson and Novosibirsk. It is one of the first examples of this type of collaboration between Russian and American colleagues.

14.
Data dependence uniformization, a method for overcoming the difficulties in parallelizing a doubly nested loop with irregular dependence constraints, is proposed. This approach is based on the concept of vector decomposition. A simple set of basic dependences is developed from which all dependence constraints can be composed. The set of basic dependences is added to every iteration to replace all original dependences so that the dependence constraints become uniform. An efficient synchronization method is presented to obey the uniform dependence constraints in every iteration.
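
The vector-decomposition idea can be illustrated with a small Python sketch for a doubly nested loop. It is one possible construction, not necessarily the one used in the paper: it assumes every dependence distance vector `(di, dj)` is lexicographically positive and picks the two basic vectors `(0, 1)` and `(1, m)`, where `m` is the smallest `floor(dj / di)` over vectors with `di > 0`. The function names are hypothetical.

```python
def basic_dependences(dep_vectors):
    """Pick two basic vectors, (0, 1) and (1, m), that generate every input vector."""
    for di, dj in dep_vectors:
        if di < 0 or (di == 0 and dj <= 0):
            raise ValueError(f"({di}, {dj}) is not lexicographically positive")
    # m is the smallest floor(dj / di) over vectors that cross outer iterations.
    slopes = [dj // di for di, dj in dep_vectors if di > 0]
    m = min(slopes) if slopes else 0
    return (0, 1), (1, m)


def decompose(vec, basis):
    """Return non-negative integers (a, b) with vec == a*(0, 1) + b*(1, m)."""
    di, dj = vec
    m = basis[1][1]
    return dj - di * m, di


# Hypothetical irregular dependence distance vectors.
deps = [(1, -2), (2, 3), (0, 4), (3, -1)]
basis = basic_dependences(deps)
coefficients = {vec: decompose(vec, basis) for vec in deps}
```

Because every original vector is a non-negative integer combination of the two basic vectors, enforcing only the basic dependences in every iteration also enforces all of the original ones, at the cost of some extra ordering constraints.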

15.
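Existing OpenMP scheduling strategies usually handle linear loop structures in programs with dynamic strategies, which suffer from load imbalance and high scheduling overhead. A nonlinear static scheduling strategy, Nonlinear_static, is proposed for loops whose load increases or decreases linearly. It builds a scheduling model that combines the parameter describing the uniform change in load across iterations with the total load, the peak load, and the number of threads, and computes the mapping of loop iterations to threads so that chunk sizes increase or decrease nonlinearly. The load of the linear loop is thus distributed evenly across the threads. The strategy is implemented in the open-source OMPi compiler. In multithreaded scheduling experiments on the Adjoint Convolution, Compute Pots, Matrix Multiplication, and Mandelbrot Set applications, Nonlinear_static reduced execution time on linear loops by 5%~10% compared with static, dynamic, and guided scheduling, and has the advantage of no scheduling overhead.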
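
The idea of giving each thread an equal share of a linearly growing load can be illustrated with a short Python sketch. It is not the scheduling model from the paper: it assumes iteration `i` (0-based) costs exactly `i + 1` units and places each chunk boundary at the first iteration where the cumulative load reaches the next equal share. The function name `nonlinear_static_chunks` is hypothetical.

```python
def nonlinear_static_chunks(n_iters, n_threads):
    """Split iterations 0..n_iters-1 into contiguous chunks of roughly equal load.

    Assumption: iteration i costs i + 1 units, so the first k iterations
    cost k * (k + 1) / 2 units in total.
    """
    total = n_iters * (n_iters + 1) // 2
    bounds = [0]
    k = 0
    for t in range(1, n_threads):
        target = total * t / n_threads  # cumulative load owed to the first t threads
        while k < n_iters and k * (k + 1) / 2 < target:
            k += 1
        bounds.append(k)
    bounds.append(n_iters)
    return [range(bounds[t], bounds[t + 1]) for t in range(n_threads)]


chunks = nonlinear_static_chunks(100, 4)
sizes = [len(c) for c in chunks]
loads = [sum(i + 1 for i in c) for c in chunks]
```

With a linearly increasing load the chunk sizes shrink from the first thread to the last, while the per-thread loads stay close to one another. The mapping is computed once before the loop runs, which is why a static scheme of this kind has no run-time scheduling overhead.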

16.
Global software pipelining is a complex but efficient compilation technique to exploit instruction-level parallelism for loops with branches. This paper presents a novel global software pipelining technique, called Trace Software Pipelining, targeted to instruction-level parallel processors such as Very Long Instruction Word (VLIW) and superscalar machines. Trace software pipelining applies a global code scheduling technique to compact the original loop body. The resulting loop is called a trace software pipelined (TSP) code. The trace software pipelined code can be directly executed with special architectural support or can be transformed into a globally software pipelined loop for the current VLIW and superscalar processors. Thus, exploiting parallelism across all iterations of a loop can be completed through compacting the original loop body with any global code scheduling technique. This makes our new technique very promising in practical compilers. Finally, we also present the preliminary experimental results to support our new approach.

17.
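Application-specific processors such as DSPs mainly support particular applications, so their instruction sets often support only a limited set of data types. When such a processor is programmed in a high-level language and the program uses an exotic data type that the processor does not support, the compiler must translate operations on that type into a sequence of supported instructions while preserving semantics. This paper proposes a method for handling exotic data types in a VLIW DSP compiler, including annotation of intermediate code that contains exotic data types, computation of scheduling dependences, and improvements to register allocation. The method requires relatively small changes to the compiler and is efficient.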

18.
Inter‐iteration dependences in loops can hinder loop‐level parallelism. For some loops, existing thread‐level speculation techniques fail to expose their inherent loop‐level parallelism, because some inter‐iteration dependences are too costly to synchronize, predict, pre‐compute and isolate. This paper presents a compiler technique called loop recreation to change the nature of some dependences (by turning some inter‐iteration dependences into intra‐iteration ones and vice versa) in a loop so that the inter‐iteration dependences in the transformed loop are less costly to enforce at runtime than those in the original loop. We present an algorithm for finding an optimal loop recreation transformation with respect to a simple misspeculation cost model and demonstrate the performance advantages of loop recreation over two recent techniques for multicore systems running nine representative irregular applications. Copyright © 2009 John Wiley & Sons, Ltd.
