1.

张胜昌《电脑编程技巧与维护》2011,(24):50,64

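By observing cases where data queries respond too slowly while an application is running, and by analyzing the relevant code with a debugger, the author identifies the code that degrades query response time and optimizes it, removing unreasonable designs from the code and the query statements and thereby improving the running efficiency of the information system.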

2.

Optimization Strategies and Techniques for ARM Programming (total citations: 6; self-citations: 2; citations by others: 4)

刘侃张永泰刘洛琨《单片机与嵌入式系统应用》2004,(4):70-72

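Program optimization is the process of adjusting and improving a program with software development tools after coding is complete, so that the program makes full use of resources, runs more efficiently, and has a smaller code size. Depending on its focus, program optimization is divided into execution-speed optimization and code-size optimization. Execution-speed optimization means reducing the number of instructions executed to complete a given task, by means such as restructuring the application, on the basis of a thorough understanding of the software and hardware characteristics.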

3.

Linker-Based Optimization of RISC-V Word Load Instructions

乌鑫龙廖春玉《计算机系统应用》2022,31(9):24-30

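As a representative reduced instruction set, RISC-V also exhibits some of the drawbacks of reduced instruction sets, and larger program size is one of them. In a reduced instruction set computer (RISC), complex operations generally require more instructions than in a complex instruction set computer (CISC), so the resulting binaries are larger than their CISC counterparts. Because embedded devices generally have little RAM and ROM, program size is especially important in embedded scenarios. To reduce RISC-V code size as far as possible on top of the existing compressed instruction set, the RISC-V sub-extension Zce defines a series of instructions. Among them, a group of instructions represented by LWGP is used to reduce the number of instructions needed when loading/storing data. The paper analyzes how the instructions represented by LWGP reduce code size and implements them in the LLD linker; it evaluates the effectiveness of the binary-size optimization by comparing program size before and after using LWGP and related instructions, and proposes follow-up improvements.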

4.

Design and Implementation of a Loop Instruction Buffer for VLIW Processors

李勇胡慧俐杨焕荣《计算机应用》2014,34(4):1005-1009

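Loops account for a large share of execution time in digital signal processing software. Holding loop code in an instruction buffer reduces the number of program memory accesses and improves processor performance. A buffer supporting loop instructions is added to the instruction pipeline of a VLIW processor; it caches loop instructions and dispatches them to the functional units in software-pipelined form. Loop code is thus fetched from memory once but executed many times, which greatly reduces memory accesses. While loop instructions are running, the buffer signals the program memory to enter a sleep state, which lowers processor power consumption. Tests with typical applications show that with the loop buffer the idle rate of the instruction-fetch pipeline exceeds 90%, overall processor performance improves by about 10%, and the hardware area overhead of the loop buffer is about 9% of the instruction-fetch pipeline.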
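
The "fetched once, executed many times" effect described in this abstract can be illustrated with a toy fetch counter. This is only a sketch under simplifying assumptions that are not from the paper: the whole loop body fits in the buffer, every instruction costs one program-memory fetch, and the function names are hypothetical.

```python
def program_memory_fetches(loop_body_len, iterations, use_loop_buffer):
    """Count program-memory instruction fetches for one loop (toy model)."""
    if loop_body_len < 0 or iterations < 0:
        raise ValueError("lengths and iteration counts must be non-negative")
    if iterations == 0:
        return 0
    if use_loop_buffer:
        # Assumption: the body is fetched once into the buffer and then
        # replayed from the buffer on every later iteration.
        return loop_body_len
    # Without a buffer, every iteration re-fetches the whole body.
    return loop_body_len * iterations


def fetch_reduction(loop_body_len, iterations):
    """Fraction of program-memory fetches avoided by the loop buffer."""
    without = program_memory_fetches(loop_body_len, iterations, False)
    if without == 0:
        raise ValueError("the loop performs no fetches")
    with_buffer = program_memory_fetches(loop_body_len, iterations, True)
    return 1 - with_buffer / without
```

For an 8-instruction body run 100 times, the model counts 800 fetches without the buffer and 8 with it. This toy figure is unrelated to the measurements reported in the abstract.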

5.

Redundant LOAD Elimination in Dynamic Binary Translation

王丽一文延华《计算机应用与软件》2008,25(6):40-43

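A dynamic binary translation system translates executable code for a source machine into executable code for a target machine based on the program's dynamic execution information. Translation into the intermediate representation produces some redundant LOAD instructions; to improve the efficiency of the generated code, the paper proposes eliminating them. The benefit of this optimization can exceed its own overhead, achieving the optimization goal.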
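
A minimal sketch of redundant-load elimination over one straight-line block of intermediate code follows. It is not the paper's algorithm. The tuple-based IR, the slot names, and the assumption that differently named memory slots never alias are all hypothetical choices made for illustration.

```python
def _kill_register(available, reg):
    """Forget every memory slot whose cached copy lived in `reg`."""
    for slot in [s for s, r in available.items() if r == reg]:
        del available[slot]


def eliminate_redundant_loads(block):
    """Replace repeated loads of the same slot with register moves.

    Instructions are tuples: ("load", dst, slot), ("store", slot, src),
    or (opname, dst, *sources) for any other register-writing operation.
    Assumption: distinct slot names refer to distinct memory locations.
    """
    available = {}  # slot -> register that currently holds its value
    out = []
    for ins in block:
        op = ins[0]
        if op == "load":
            _, dst, slot = ins
            holder = available.get(slot)
            if holder == dst:
                continue  # dst already holds this slot's value
            _kill_register(available, dst)  # dst is about to be overwritten
            if holder is not None:
                out.append(("mov", dst, holder))  # reuse the earlier load
            else:
                out.append(ins)
                available[slot] = dst
        elif op == "store":
            _, slot, src = ins
            out.append(ins)
            available[slot] = src  # the stored register mirrors the slot
        else:
            out.append(ins)
            _kill_register(available, ins[1])  # ins[1] is the written register
    return out
```

In a block that loads slot "eax" into t0 and later loads it again into t2 with no intervening write to t0, the second load becomes ("mov", "t2", "t0"). Once t0 is overwritten, a later load of "eax" is kept.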

6.

Assembly- and Link-Time Optimization for the CK810 Processor

胡敏卢永江刘兵《计算机工程》2014,(11):250-254

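The paper proposes an assembly- and link-time optimization technique for the mixed 16/32-bit instruction set of the CK810 processor. When the assembler emits the binary file, it dynamically selects each instruction's encoding according to the characteristics of the instructions and operands in the CK810 mixed 16/32-bit instruction set, relaxing instructions and maximizing code density. For instructions whose encoding cannot be determined at assembly time, relocations are left so that the optimization is completed at link time. At link time, the now-determinate information is used to compress the whole program and substitute instructions, making the program run more efficiently and occupy less space. Assembly- and link-time optimization overcomes the limitation of traditional compilers that optimize only within one module, extending the optimization scope to the whole program and enabling cross-module optimization; for CK810-based programs, code density improves by 7.52% on average and performance by 7.91% on average.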

7.

Source-Level Software Energy Consumption Modeling and Analysis for Embedded Systems

叶珊郭荣佐黄君《计算机应用研究》2017,34(10)

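Considering the effect of embedded system energy consumption on the operating time of embedded devices, the paper examines software energy consumption from the instruction level up to the source level. It first analyzes the relevant features of source-level statements and, based on the instruction energy of source statements, proposes a source-level energy model; it then uses the model to optimize the energy consumption of different classes of statements in the source code of five classic algorithms; finally it compares energy consumption before and after optimization for the five algorithms. Experiments show that the model reduces the energy consumption of the optimized source programs by 9.46% to 50.29%, achieving the goal of reducing embedded software energy consumption.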

8.

Cost-Benefit Analysis of Dynamic Optimization in Dynamic Binary Translation

孙光辉王丽娟《计算机时代》2010,(2):4-5

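Traditional static compiler optimization is subject to various limitations, so the paper proposes run-time dynamic optimization as a countermeasure. During execution, the program's profile information is continuously monitored, and the code is optimized and transformed based on that information, creating and running an optimized version of the code. This run-time optimization operates directly on the program's binary code, independent of the programming language or compiler. This makes the optimization transparent and also lets legacy code (older versions of source code) gain performance from optimization techniques.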
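
The cost-benefit question in this entry's title can be made concrete with a break-even estimate: optimizing a code region at run time pays off only if the region executes often enough afterwards. The sketch below is a generic model, not the paper's analysis. The function names and the assumption of a constant saving per execution are hypothetical.

```python
import math


def break_even_executions(opt_cost, time_before, time_after):
    """Smallest number of executions after which optimizing is a net win.

    opt_cost    : one-time cost of producing the optimized version
    time_before : cost of one execution of the unoptimized code
    time_after  : cost of one execution of the optimized code
    Returns None when the optimized code is not faster.
    """
    saving = time_before - time_after
    if saving <= 0:
        return None
    # Need n * saving > opt_cost, so n is the first integer above the ratio.
    return math.floor(opt_cost / saving) + 1


def worth_optimizing(expected_executions, opt_cost, time_before, time_after):
    """True when the expected execution count reaches the break-even point."""
    n = break_even_executions(opt_cost, time_before, time_after)
    return n is not None and expected_executions >= n
```

With a one-time cost of 1000 units and a saving of 2 units per execution, the break-even point is 501 executions. A profile-driven optimizer would compare this against the execution counts it observes.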

9.

Optimization of Conditional Branch Instructions in Binary Translation Based on TCG

张家豪单征岳峰傅立国王军李明亮《计算机工程与科学》2019,41(8):1343-1352

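Introducing the TCG intermediate representation into binary translation enables program porting across multiple target platforms and makes it easier to bring in new platforms, addressing a new platform's compatibility with mainstream platforms. However, because the original intermediate representation weakens the correlation between instructions during translation, the generated back-end code contains many redundant instructions, which reduces the execution efficiency of the translated program. The paper analyzes the feasibility of instruction optimization and targets conditional jump instructions: instruction preprocessing improves the intermediate representation, changing code generation from the intermediate representation to back-end code from a one-to-many to a many-to-many translation mode; using instruction reduction, optimized translation algorithms are designed for the two conditional-jump patterns, CMP-JX and TEST-JX, and implemented on the open-source QEMU platform. Tests on the NPB-3.3 and SPEC CPU 2006 suites show that, compared with the previous translation mode, the code expansion rate of the optimized code decreases by 14.62% on average and the translated program runs 17.23% faster, validating the method.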

10.

SLP Optimization Based on Cross-Basic-Block Transformation and Loop Distribution

索维毅赵荣彩姚远张小妹《计算机科学》2013,40(10):24-28,60

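Existing SLP optimization algorithms cannot handle dependence cycles and reductions in inner loops, and they generate many redundant unpacking and assignment statements at basic-block boundaries, so vectorization efficiency is low. To address this, the paper proposes an SLP optimization algorithm based on cross-basic-block transformation and loop distribution. Working on the control flow graph, the algorithm performs cross-basic-block vectorization according to the define-use relations of array variables between basic blocks and the data dependences that cross basic blocks. It applies cross-basic-block transformation and loop distribution in order, exposing as much statement-level parallelism as possible in the basic blocks of the innermost loop, so that the SLP auto-vectorizing compiler generates vectorized code with more SIMD instructions. Experimental results show that the algorithm hides more of the overhead of redundant cross-basic-block operations and exploits cross-basic-block data dependences to generate better SIMD instructions, effectively improving the speedup of vectorized programs.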

11.

Structured FORTRAN preprocessors generating optimized output

Tatsuo Tsuji Katsumasa Watanabe Atsushi Ikehata 《Software》1988,18(5):427-442

Usually a structured FORTRAN program is transformed into a standard FORTRAN program by a one-pass preprocessor. In this case, the many redundant continue and goto statements that the preprocessor generates cause several problems: (i) the generated FORTRAN program is not easy to read; (ii) the files that store the related programs become large; (iii) the total time for running both the FORTRAN compiler and the output program itself increases. This paper presents a new scheme for constructing a one-pass preprocessor that generates optimized FORTRAN code by suppressing the redundant statements. By employing this scheme we have constructed our own preprocessors for Westran (one of the Structured FORTRAN languages) and Ratfor and measured them against the traditional preprocessors for the Westran and Ratfor languages. One result is that the total time for users, namely ‘preprocessing time + FORTRAN compiling time’, is somewhat less than with the traditional preprocessors. This is because suppressing the input/output of the redundant statements from/to files reduces both preprocessing and compiling times, which fully compensates for the time overhead of the optimization process.

12.

A Compiler Optimization for Accelerating Memory Access Address Calculation

高秀武姜军白书敬黄亮明《计算机工程》2023,49(1):173-180

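On domestic Sunway (申威) high-performance multi-core server systems, the base compiler does not take the processor's instruction characteristics into account when generating code for memory accesses in applications, so the generated address-calculation code is inefficient and limits the performance of the high-performance processor. To make full use of the processor's computing capability, the paper proposes a compiler optimization that accelerates memory access address calculation. It relies on the processor's support for arithmetic instructions with a scaling factor: in the compiler back end's legality check for memory address expressions, it adds a legality check for multiply-add address expressions, automatically recognizing multiply-add operations in address expressions and validating them. For qualifying address expressions, code generation matches and emits scaled arithmetic instructions to compute the address quickly, speeding up the issue and execution of memory access instructions and address generation in applications, and improving memory access efficiency. Evaluation with the industry-standard SPEC CPU2006 benchmark suite shows that, relative to the unoptimized compiler, average performance improves by 2.53% and 1.50% on the SPECspeed Integer and SPECspeed Floating Point subsets, respectively.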
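
The multiply-add pattern matching described in this abstract can be sketched on a toy expression tree. Everything below is an illustrative assumption rather than the paper's compiler code: the tuple-based expressions, the set of legal scale factors, and the `scaled_add` pseudo-instruction are hypothetical.

```python
import itertools

LEGAL_SCALES = {4, 8}  # assumption: scale factors the target can encode


def match_scaled_add(expr):
    """Return (base, index, scale) if expr is base + index * scale with a
    legal constant scale (operands in either order), otherwise None."""
    if not (isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "add"):
        return None
    _, lhs, rhs = expr
    for base, prod in ((lhs, rhs), (rhs, lhs)):
        if isinstance(prod, tuple) and len(prod) == 3 and prod[0] == "mul":
            _, a, b = prod
            for index, scale in ((a, b), (b, a)):
                if type(scale) is int and scale in LEGAL_SCALES:
                    return base, index, scale
    return None


def lower_address(expr):
    """Lower an address expression to a list of pseudo-instructions.

    Returns (instructions, result_operand). Leaves are register names
    (str) or integer constants.
    """
    out = []
    temps = itertools.count()

    def lower(node):
        if not isinstance(node, tuple):
            return node  # register or constant: nothing to emit
        matched = match_scaled_add(node)
        dst_name = None
        if matched is not None:
            base, index, scale = matched
            b, i = lower(base), lower(index)
            dst_name = f"t{next(temps)}"
            # One scaled instruction replaces a separate mul and add.
            out.append(("scaled_add", dst_name, i, scale, b))
            return dst_name
        op, lhs, rhs = node
        l, r = lower(lhs), lower(rhs)
        dst_name = f"t{next(temps)}"
        out.append((op, dst_name, l, r))
        return dst_name

    result = lower(expr)
    return out, result
```

Lowering ("add", "p", ("mul", "i", 8)) yields a single `scaled_add`, while a scale of 3 falls back to a separate `mul` and `add`.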

13.

Program Structuring with Controlled Code Growth

张远芳马国凯朱嘉华朱传琪《计算机工程与科学》2002,24(1):100-102

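At present, goto statements are usually eliminated either purely by adding temporary variables and associated tests, or by copying shared code. The former increases the number of decisions and makes semantic analysis harder; the latter yields clearly structured programs but causes some benchmark programs to grow dramatically. To address these problems, the paper proposes a code-copying algorithm that keeps code growth under control and also handles irreducible programs effectively.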

14.

Design and implementation of a queue compiler

Arquimedes Canedo Ben A. Abderazek Masahiro Sowa 《Microprocessors and Microsystems》2009,33(2):129-138

Queue processors are a viable alternative for high performance embedded computing and parallel processing. We present the design and implementation of a compiler for a queue-based processor. Instructions of a queue processor implicitly reference their operands, making the programs free of false dependencies. Compiling for a queue machine differs from traditional compilation methods for register machines. The queue compiler is responsible for scheduling the program in a level-order manner to expose natural parallelism and for calculating the relative offset values that instructions use to access their operands. This paper describes the phases and data structures used in the queue compiler to compile C programs into assembly code for the QueueCore, an embedded queue processor. Experimental results demonstrate that our compiler produces good code in terms of parallelism and code size when compared to code produced by a traditional compiler for a RISC processor.
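
The level-order scheduling mentioned in this abstract can be illustrated for plain expression trees: visiting the tree level by level, deepest level first and left to right, gives an instruction order that a FIFO operand queue can execute directly. The sketch below is a toy model, not the QueueCore compiler. The tuple-based trees, the `ld` pseudo-instruction, and the function names are assumptions, and the offset calculation the paper describes for shared operands is not modelled.

```python
from collections import deque
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def levels(tree):
    """Group the nodes of an expression tree by depth, left to right."""
    result = []
    frontier = [tree]
    while frontier:
        result.append(frontier)
        frontier = [child
                    for node in frontier if isinstance(node, tuple)
                    for child in node[1:]]
    return result


def queue_code(tree):
    """Emit queue-machine code: deepest level first, left to right."""
    program = []
    for level in reversed(levels(tree)):
        for node in level:
            if isinstance(node, tuple):
                program.append((node[0],))    # binary operator
            else:
                program.append(("ld", node))  # load a named variable
    return program


def run(program, env):
    """Execute on a FIFO queue: operands leave the front, results join the rear."""
    queue = deque()
    for ins in program:
        if ins[0] == "ld":
            queue.append(env[ins[1]])
        else:
            lhs = queue.popleft()
            rhs = queue.popleft()
            queue.append(OPS[ins[0]](lhs, rhs))
    return queue.popleft()
```

For (a + b) * (c - d), the emitted order is ld a, ld b, ld c, ld d, +, -, *. No instruction names a register, because every operand is taken implicitly from the front of the queue.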

15.

Analysis of the BPF Implementation Mechanism and Study of Its Performance Optimization

曾鸣赵荣彩《计算机工程》2007,33(12):43-45,4

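The BSD Packet Filter (BPF) is the kernel component of the network packet capture and filtering mechanism provided by the BSD Unix operating system. The paper describes the composition and operation of BPF, analyzes the acyclic control flow graph filter model that BPF uses, and introduces the virtual-machine-based implementation of this model. To improve filter performance, the instruction redundancy that arises when the BPF virtual machine instruction generator handles combinations of several filter conditions must be eliminated. By introducing static single assignment (SSA) form together with redundant predicate elimination and peephole optimization, the average path length of the CFG can be shortened effectively, which improves filter performance.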

16.

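Array Last-Write Relation Analysis in Automatic Program Parallelization (total citations: 1; self-citations: 0; citations by others: 1)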

罗勇张平龚雪容《计算机工程》2008,34(16):95-97

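In automatic program parallelization, the data collection phase may produce redundant communication. The paper uses array last-write relation analysis to eliminate it: the last-write relation of array data in nested loops is solved quickly and the result is passed to the compiler back end, which generates precise data collection code. The paper describes the purpose and content of the research on array last-write relations, classifies the nested loops handled according to their structural features, and gives the procedure for implementing the algorithm. Test results demonstrate the correctness and efficiency of the algorithm; the precise data collection code it produces effectively eliminates part of the redundant communication, improving the performance of the parallelized program.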

17.

A program logic for resources

David Aspinall Lennart Beringer Martin Hofmann Hans-Wolfgang Loidl Alberto Momigliano 《Theoretical computer science》2007

We introduce a reasoning infrastructure for proving statements about resource consumption in a fragment of the Java Virtual Machine Language (JVML). The infrastructure is based on a small hierarchy of program logics, with increasing levels of abstraction: at the top there is a type system for a high-level language that encodes resource consumption. The infrastructure is designed to be used in a proof-carrying code (PCC) scenario, where mobile programs can be equipped with formal evidence that they have predictable resource behaviour.

18.

Application of redundant computation in program debugging

Zakarya A. Alzamil 《Journal of Systems and Software》2008,81(11):2024-2033

Programmers spend most of their time and resources localizing program defects. They also commit many errors by manipulating dynamic data improperly, which can produce dynamic memory problems such as dangling pointers, memory leaks, and inaccessible objects. Dangling pointers can occur when a function returns a pointer to an automatic variable, or when a deleted object is accessed. Inaccessible objects occur when a pointer is reassigned to point to another object, either through the new operator or through a regular assignment, leaving the original object inaccessible. Memory leaks occur when dynamic data is allocated but never de-allocated. Such dynamic memory problems cause programs to behave incorrectly. Improper use of dynamic data is a common defect that is easy to commit but difficult to diagnose and discover. In this paper, we propose a dynamic approach that detects different types of program defects, including those that result from misusing dynamic data. Our approach uses the notion of redundant computation to identify suspicious locations in the program that may contain defects. Redundant computation is the execution of one or more program statements that do not contribute to the program output. The notion of redundant computation is introduced as a potential indicator of defects in programs, and we investigate its application in debugging. The detection of redundant computation indicates a deficiency that may represent a bug in the program. The experimental results show that detecting redundant computation can help debuggers localize the sources of program defects.
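
The abstract's definition of redundant computation (executed statements that do not contribute to the program output) can be sketched as a backward pass over an execution trace. This is a simplified illustration, not the paper's detection method. The trace format and function name are hypothetical, and only data dependences are tracked; control dependences and pointer aliasing are ignored.

```python
def redundant_statements(trace):
    """Return ids of executed statements that never reach an output.

    `trace` lists executed statements in order, each as a tuple
    (stmt_id, defined_var_or_None, used_vars, is_output).
    """
    needed = set()        # variables whose current value is still needed
    contributing = set()  # trace positions that influence some output
    for pos in range(len(trace) - 1, -1, -1):
        _, defined, used, is_output = trace[pos]
        if is_output or (defined is not None and defined in needed):
            contributing.add(pos)
            needed.discard(defined)  # this definition satisfies the need
            needed.update(used)      # its operands are needed in turn
    return [trace[pos][0] for pos in range(len(trace))
            if pos not in contributing]
```

For the trace x = 5; y = x + 1; x = 10; z = y * 2; print(x), only the third and fifth statements contribute to the output, so the first, second, and fourth are reported. Overwriting x before it is ever printed is the kind of suspicious location the abstract describes.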