Similar literature
1.
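Register allocation for general-purpose processors usually relies on graph coloring. Except in special cases, optimal graph coloring is NP-complete, so traditional register allocators use heuristic graph-coloring algorithms, which generate high-quality code for regular RISC processors. Because embedded processors have irregular architectures, however, the code produced by these traditional allocators does not meet the requirements of the embedded domain. This paper proposes a new metaheuristic that hybridizes a genetic algorithm with local search and largely overcomes the shortcomings of traditional register allocation. Experiments show that the new algorithm produces about 30% less spill code than the traditional graph-coloring register allocator.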
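
For orientation, here is a minimal sketch of the kind of heuristic graph-coloring baseline this abstract compares against. It is not the paper's genetic/local-search method. The function name `color_registers`, the degree-based ordering, and the toy interference graph are all assumptions made for illustration.

```python
def color_registers(interference, num_regs):
    """Greedy graph-coloring register allocation (illustrative sketch).

    interference: dict mapping each virtual register to the set of
                  virtual registers that are live at the same time.
    Returns (assignment, spilled): a dict of physical register numbers
    and the list of virtual registers that could not be colored.
    """
    assignment = {}
    spilled = []
    # Visit the most constrained (highest-degree) nodes first; ties are
    # broken by name so the result is deterministic.
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for v in order:
        # Registers already taken by colored neighbours are unavailable.
        used = {assignment[n] for n in interference[v] if n in assignment}
        free = [r for r in range(num_regs) if r not in used]
        if free:
            assignment[v] = free[0]
        else:
            spilled.append(v)  # no register left: this value goes to memory
    return assignment, spilled


# Hypothetical interference graph: a, b, c are mutually live; d overlaps c only.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
```

With three registers this toy graph colors without spills. With two registers, one member of the `a`/`b`/`c` triangle has to be spilled, which is the cost a better search strategy tries to reduce.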

2.
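BWDSP is an independently designed Chinese VLIW (very long instruction word) digital signal processor that supports SIMD. Its SIMD instructions can perform four 32-bit computations simultaneously on four macros and impose special rules on register usage, which the register allocation strategy of the Open64 compiler does not accommodate. This paper studies register allocation optimization for BWDSP SIMD instructions and implements it in OCC, the BWDSP compiler.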

3.
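Describes the design and implementation of a high-level language compiler for the embedded processor Si02. It proposes a register allocation method specific to the Si02, a circular-stack mechanism, and presents several algorithms among the compiler's key techniques, which simplify the implementation of the embedded compiler.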

4.
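Register allocation is one of the most critical optimizations in a compiler. Feedback-directed compilation optimization changes a program's future behavior based on trends observed in its current and previous runs, and it can supply register allocation with useful optimization information. Building on an analysis of feedback-directed optimization in the Open64 compiler, this work implements and extends its use in register allocation on the ALPHA architecture and obtains good optimization results.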

5.
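Register allocation for network processors with multiple register banks   Total citations: 1, self-citations: 0, citations by others: 1
Traditional graph-coloring register allocation cannot directly handle the operations of network processors, so this paper proposes a register allocation technique for network processors with multiple register banks. It analyzes, in turn, which register banks a symbolic register may reside in, how to resolve the conflict when there is no candidate bank, and which bank to choose when there are several candidates. By integrating these methods with graph-coloring register allocation, multi-bank register allocation is implemented on the IXP, improving its programmability. The method can also be applied to other processors with similar register structures.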

6.
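Automatic performance optimization techniques for SpMV and their application   Total citations: 1, self-citations: 0, citations by others: 1
In scientific computing, sparse matrix-vector multiplication (SpMV) is an important and heavily called computational kernel. Because typical SpMV implementations have a very low ratio of floating-point operations to memory accesses and a highly irregular memory access pattern, their actual performance is often poor. Register blocking combined with a heuristic block-size selection algorithm divides the sparse matrix into small dense blocks and reuses elements of the vector x held in registers, which improves the kernel's performance. This paper analyzes and summarizes several key optimization techniques used by the OSKI package and tests them in a real application. The tests show that an application must call SpMV on the order of hundreds of times before the extra time overhead of applying these optimizations is amortized and a speedup is obtained. On Pentium 4 and AMD Athlon platforms, the average speedups over 10 test matrices were 1.69 and 1.48, respectively.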
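
The register-blocking idea in this abstract can be sketched as a block-CSR (BCSR) multiply. This is a pure-Python illustration, not OSKI's API or implementation. The helper names `to_bcsr` and `bcsr_spmv`, the fixed 2x2 block size, and the sample matrix are assumptions. In a compiled kernel the `acc` and `xs` values would sit in registers.

```python
def to_bcsr(dense, r, c):
    """Convert a dense matrix (list of rows) into r x c block-CSR form,
    keeping only blocks that contain at least one nonzero."""
    n_rows, n_cols = len(dense), len(dense[0])
    assert n_rows % r == 0 and n_cols % c == 0, "dimensions must divide evenly"
    row_ptr, col_idx, blocks = [0], [], []
    for bi in range(0, n_rows, r):
        for bj in range(0, n_cols, c):
            block = [dense[bi + i][bj + j] for i in range(r) for j in range(c)]
            if any(block):
                col_idx.append(bj)    # first column covered by the block
                blocks.append(block)  # block stored row-major, zeros included
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, blocks


def bcsr_spmv(row_ptr, col_idx, blocks, x, r, c):
    """Compute y = A @ x for a matrix stored in r x c block-CSR form."""
    y = [0.0] * ((len(row_ptr) - 1) * r)
    for brow in range(len(row_ptr) - 1):
        acc = [0.0] * r  # per-block-row accumulators
        for k in range(row_ptr[brow], row_ptr[brow + 1]):
            xs = x[col_idx[k]:col_idx[k] + c]  # loaded once, reused by all r rows
            block = blocks[k]
            for i in range(r):
                for j in range(c):
                    acc[i] += block[i * c + j] * xs[j]
        y[brow * r:(brow + 1) * r] = acc
    return y


# Hypothetical 4x4 matrix with two nonzero 2x2 blocks.
A = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 5, 0],
    [0, 0, 0, 6],
]
row_ptr, col_idx, blocks = to_bcsr(A, 2, 2)
y = bcsr_spmv(row_ptr, col_idx, blocks, [1, 1, 1, 1], 2, 2)
```

Blocks store explicit zeros (the second block above is half zeros), so a larger block size trades extra arithmetic for fewer index loads and more reuse of x. That trade-off is why the abstract pairs register blocking with a heuristic for choosing the block size.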

7.
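Based on ISCD, the KernelC compiler designed at Stanford University, this paper designs and implements a core VLIW compiler for a 64-bit stream processor architecture and optimizes it for high-performance computing applications, implementing distributed register load balancing and automatic instruction merging. Experimental results show that the compiler exploits the parallelism in programs well and is highly efficient.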

8.
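Proposes the design and control method of an improved register queue (FIFO) for Java processors. Starting from a traditional pointer-moving FIFO, the design changes the operation width of the read and write pointers, adds read ports, and adds a bypass, so that the improved register queue can accommodate the variable length of Java bytecode instructions. The design has been used in a hardware implementation of a Java virtual machine for embedded systems, where it improves instruction fetch efficiency for Java processing and facilitates the subsequent instruction folding.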

9.
10.
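Special-purpose processors such as DSPs mainly support specific applications, so their instruction sets often support only a limited set of data types. When such a processor is programmed in a high-level language and the program uses an exotic data type that the processor does not support, the compiler must convert it, preserving semantics, into a sequence of instructions the processor does support. This paper proposes a method for handling exotic data types in a VLIW DSP compiler, covering annotation of intermediate code that contains exotic data types, computation of scheduling dependences, and improvements to register allocation. The approach requires relatively small changes to the compiler and is efficient.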

11.
David R. Hanson 《Software》1983,13(8):745-763
Program optimization has received a great deal of attention for many years, which has resulted in numerous advances in compiler technology. The effectiveness of various simple optimizations has received comparatively little attention during the same time period. The simplicity of most programs suggests that straightforward optimizations pay the greatest dividends. This paper describes three such optimizations suitable for one-pass compilers. The optimizations involve expression rearrangement, instruction selection, and the use of a cache for the allocation of resources. The cost of these optimizations is low; none require major changes to the size or structure of the compiler or reduce compilation speed by more than 10%. The benefits are high; each optimization results in at least a 10% average reduction in object code size and a corresponding reduction in execution time. Examples and implementation details are also described.

12.
The explosive growth in network bandwidth and Internet services such as QoS (quality of service) and SLA (service level agreement) monitoring have created the need for new networking hardware called a Network Processing Unit (NPU). In order to rapidly reconfigure the NPU for frequently varying Internet services and technologies, a high-performance C compiler is urgently needed. Several code generation techniques, which are intended to meet the high code quality demands of other types of application specific instruction-set processors (ASIPs) like digital signal processors (DSPs), have already been developed. However, these techniques are insufficient for NPUs due to striking architectural differences such as asymmetric data paths. The main purpose of this paper is to discuss our recent experience with the development of a commercial compiler for a new NPU called the Paion PPII, which is basically a packet engine for NPU to meet the growing need for new high-bandwidth communication equipment targeted for Internet routers and ethernet adapters. For this purpose, we will first show the architectural challenges posed by the target NPU. Then, we will describe several compiler techniques that we found to be effective for the target NPU with various non-orthogonal architectural features. The current implementations of the PPII use a VLIW (Very Long Instruction Word) architecture. So, we handled this VLIW-style architecture by employing a simple code compaction scheme which packs multiple parallel instructions into one long instruction word. The experimental results show that our techniques are effective for significantly reducing the dynamic instruction count. Copyright © 2004 John Wiley & Sons, Ltd.

13.
Optimizing compilers increase the performance of the resulting code by applying a number of code optimization techniques. Profile information can greatly increase code performance in some cases. However, the impossibility of providing a representative training execution often leads to a decline in the efficiency of profile-dependent code optimizations. This paper investigates the main causes of the performance loss of one-stage optimization compared with profile-guided optimization (PGO) and introduces some alternative compilation techniques to reduce this loss. The effectiveness of these techniques is evaluated for a compiler targeting the Elbrus VLIW architecture.

14.
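On a unified structure for hybrid optimization strategies   Total citations: 9, self-citations: 1, citations by others: 9
Hybridizing algorithms has become an important and effective way to improve optimization performance and efficiency. Focusing on meta-heuristic algorithms, this paper classifies and surveys hybrid optimization algorithms and their structures, proposes a unified structure for hybrid optimization algorithms, and analyzes and discusses several related issues, providing guiding principles for the design and application of hybrid algorithms.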

15.
Instruction-level parallel processing: History, overview, and perspective   Total citations: 11, self-citations: 0, citations by others: 11
Instruction-level parallelism (ILP) is a family of processor and compiler design techniques that speed up execution by causing individual machine operations to execute in parallel. Although ILP has appeared in the highest performance uniprocessors for the past 30 years, the 1980s saw it become a much more significant force in computer design. Several systems were built and sold commercially, which pushed ILP far beyond where it had been before, both in terms of the amount of ILP offered and in the central role ILP played in the design of the system. By the end of the decade, advanced microprocessor design at all major CPU manufacturers had incorporated ILP, and new techniques for ILP had become a popular topic at academic conferences. This article provides an overview and historical perspective of the field of ILP and its development over the past three decades.

16.
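Proposes applying the linear scan algorithm to the back-end optimization of a compiler for the transport triggered architecture (TTA) to implement global register allocation. With linear scan, the TTA compiler generates high-quality target code, has low time and space complexity, and is easy to implement. Experimental results show that, for the same number of registers, the algorithm has a clear advantage when many variables compete for registers.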
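
As a rough illustration of the linear scan idea named in this abstract, the sketch below follows the classic formulation over live intervals: expire finished intervals, then allocate or spill the interval that ends last. It is not the paper's TTA-specific implementation. The function name `linear_scan`, the interval encoding, and the example intervals are assumptions.

```python
def linear_scan(intervals, num_regs):
    """Linear scan register allocation over live intervals (sketch).

    intervals: dict mapping a variable name to its (start, end) live range.
    Returns (reg_of, spilled): register numbers for allocated variables
    and the set of variables spilled to memory.
    """
    free = list(range(num_regs))
    active = []  # (end, name) pairs for intervals holding a register, sorted by end
    reg_of, spilled = {}, set()
    # Process intervals in order of increasing start point.
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(reg_of[old])
        if free:
            reg_of[name] = free.pop(0)
            active.append((end, name))
            active.sort()
        else:
            last_end, last = active[-1]
            if last_end > end:
                # Spill the active interval that ends furthest in the
                # future and hand its register to the current interval.
                reg_of[name] = reg_of.pop(last)
                spilled.add(last)
                active[-1] = (end, name)
                active.sort()
            else:
                spilled.add(name)
    return reg_of, spilled


# Hypothetical live intervals competing for two registers.
live = {"a": (0, 10), "b": (1, 3), "c": (2, 8), "d": (4, 6)}
reg_of, spilled = linear_scan(live, 2)
```

Each interval is visited once and only the small `active` list is re-sorted, which is where the low time and space cost mentioned in the abstract comes from. A production allocator would keep `active` in a structure that avoids the repeated sort.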

17.
查那日苏  何立强  魏凤歧 《计算机工程》2010,36(11):256-258,261
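Proposes a classification method for benchmark programs based on a thermal diffusion model, classifying the SPEC CPU2000 benchmarks according to their peak temperature. It discusses the correspondence between the thermal diffusion characteristics of the benchmarks and program behavior. Experimental results show that classification based on the thermal diffusion model is effective, and its results provide a reference for combining different types of multithreaded workloads.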

18.
Most of the current cloud computing providers allocate virtual machine instances to their users through fixed-price allocation mechanisms. We argue that combinatorial auction-based allocation mechanisms are especially efficient over the fixed-price mechanisms since the virtual machine instances are assigned to users having the highest valuation. We formulate the problem of virtual machine allocation in clouds as a combinatorial auction problem and propose two mechanisms to solve it. The proposed mechanisms are extensions of two existing combinatorial auction mechanisms. We perform extensive simulation experiments to compare the two proposed combinatorial auction-based mechanisms with the currently used fixed-price allocation mechanism. Our experiments reveal that the combinatorial auction-based mechanisms can significantly improve the allocation efficiency while generating higher revenue for the cloud providers.

19.
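Design and implementation of an embedded system for a fiscal cash register   Total citations: 2, self-citations: 0, citations by others: 2
Embedded systems are being applied in an ever wider range of fields. Developing an embedded system starts with choosing a suitable microprocessor and operating system. Based on the hardware and software requirements of a fiscal cash register, this article selects the S3C44B0X microprocessor and the NucleusPLUS operating system and describes the design and implementation of the fiscal cash register system.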

20.
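Many integrated techniques have been proposed that let instruction scheduling and register allocation exchange information, increasing instruction-level parallelism without introducing excessive spill code and thereby improving performance. This survey introduces several influential algorithms, grouped by their characteristics, gives a brief evaluation and comparison of their effectiveness, and closes with some new directions in combining instruction scheduling and register allocation.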
