1.

Dictionary-based program compression on customizable processor architectures

Jari Heikkinen Jarmo Takala Henk Corporaal 《Microprocessors and Microsystems》2009,33(2):139-153

The size of the program code has become a critical design constraint in embedded systems, especially in handheld devices. Large program codes require large memories, which increase the size and cost of the chip. In addition, the power consumption is increased due to higher memory I/O bandwidth. Program compression is one of the most commonly used methods to reduce the size of the program code. In this paper, dictionary-based program compression is evaluated on a customizable processor architecture with parallel resources. In addition to code density, the effectiveness of the method is evaluated in terms of area and power consumption. Furthermore, a mechanism is proposed to maintain the programmability after compression. Up to 77% reduction in area and 73% reduction in power consumption of the program memory and the associated control logic were obtained.
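As a rough illustration of the general idea behind dictionary-based program compression, the sketch below replaces the most frequent instruction words with short dictionary indices. It is a minimal, generic sketch and not the scheme evaluated in the paper: the function names, the one-bit flag per item, and the bit-cost model are all assumptions made for this example.

```python
from collections import Counter

def build_dictionary(words, max_entries):
    """Pick the most frequent instruction words as dictionary entries."""
    counts = Counter(words)
    return [word for word, _ in counts.most_common(max_entries)]

def compress(words, dictionary):
    """Replace each word found in the dictionary with its index; keep others raw."""
    index = {word: i for i, word in enumerate(dictionary)}
    return [("idx", index[w]) if w in index else ("raw", w) for w in words]

def decompress(stream, dictionary):
    """Invert compress(): look indices up in the dictionary, pass raw words through."""
    return [dictionary[value] if kind == "idx" else value for kind, value in stream]

def compressed_bits(stream, dictionary, word_bits, index_bits):
    """Assumed cost model: one flag bit per item plus its payload,
    plus the storage needed for the dictionary itself."""
    payload = sum(1 + (index_bits if kind == "idx" else word_bits)
                  for kind, _ in stream)
    return payload + len(dictionary) * word_bits

# Toy program of six 32-bit instruction words (hypothetical values).
program = [0xA, 0xB, 0xA, 0xC, 0xA, 0xB]
dictionary = build_dictionary(program, max_entries=2)
stream = compress(program, dictionary)
```

Under this assumed cost model the toy program takes 107 bits including the dictionary, against 192 bits uncompressed. The dictionary storage is counted on purpose: for small programs it can outweigh the savings.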

2.

Research on Low-Power Embedded System Design Techniques Based on Code Compression

李曦张来勇熊悦周学海《小型微型计算机系统》2003,24(5):887-890

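Low power consumption is one of the key constraints in embedded system design. Code compression reduces the size of a program's object code, and with it the memory space the code occupies and the communication overhead, thereby lowering power consumption at the system level. This paper discusses and analyzes the characteristics of several typical code compression techniques, including instruction-set trimming compression, full-code and sub-code compression, and code compression based on an on-chip cache.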

3.

Variable Length Instruction Compression on Transport Triggered Architectures

Timo Viitanen Janne Helkala Heikki Kultala Pekka Jääskeläinen Jarmo Takala Tommi Zetterman Heikki Berg 《International journal of parallel programming》2018,46(6):1283-1303

The memories used for embedded microprocessor devices consume a large portion of the system’s power. The power dissipation of the instruction memory can be reduced by using code compression methods, which may require the use of variable length instruction formats in the processor. The power-efficient design of variable length instruction fetch and decode is challenging for static multiple-issue processors, which aim for low power consumption on embedded platforms. The memory-side power savings from compression are easily lost to an inefficient fetch unit design. We propose an implementation for instruction template-based compression and two instruction fetch alternatives for variable length instruction encoding on transport triggered architecture, a static multiple-issue exposed data path architecture. With applications from the CHStone benchmark suite, the compression approach reaches an average compression ratio of 44% at best. We show that the variable length fetch designs reduce the number of memory accesses and often allow the use of a smaller memory component. The proposed compression scheme reduced the energy consumption of synthesized benchmark processors by 15% and area by 33% on average.

4.

EREER: Energy-aware register file and execution unit using exploiting redundancy in GPGPUs

《Microprocessors and Microsystems》2020

The use of GPGPUs is growing in high-performance computing, including embedded systems. The demand for more processing power increases the size of the Register File (RF) and Execution Unit (EU) in GPGPUs, which in turn increases power consumption. However, energy and power consumption are critical in embedded systems, which rely on a battery and a simple cooling system. In this paper, we first propose a simple method to identify duplicated data in the RF of GPGPUs. We then propose a compression method that improves the energy efficiency of the RF by eliminating duplicated data and consequently deallocating some of the RF banks. Experimental results on standard benchmarks show that our compression method reduces the total RF power consumption by 15% on average, with overheads taken into account. Furthermore, we propose a computation reuse method in the EU to exploit computation redundancy. This method uses the compression information of the RF to identify identical computations and turn off the processing cores that execute them. Our computation reuse method improves the EU energy efficiency by 28.8% on average.

5.

On the implementation of bytecode compression for interpreted languages

Ekaterina Stefanov Anthony M. Sloane 《Software》2009,39(2):111-135

This paper describes a new method for code space optimization for interpreted languages called LZW‐CC. The method is based on a well‐known and widely used compression algorithm, LZW, which has been adapted to compress executable program code represented as bytecode. Frequently occurring sequences of bytecode instructions are replaced by shorter encodings for newly generated bytecode instructions. The interpreter for the compressed code is modified to recognize and execute those new instructions. When applied to systems where a copy of the interpreter is supplied with each user program, space is saved not only by compressing the program code but also by automatically removing the unused implementation code from the interpreter. The method's implementation within two compiler systems for the programming languages Haskell and Java is described and implementation issues of interest are presented, notably the recalculations of target jumps and the automated tailoring of the interpreter to program code. Applying LZW‐CC to nhc98 Haskell results in bytecode size reduction by up to 15.23% and executable size reduction by up to 11.9%. Java bytecode is reduced by up to 52%. The impact of compression on execution speed is also discussed; the typical speed penalty for Java programs is between 1.8 and 6.6%, while most compressed Haskell executables run faster than the original. Copyright © 2008 John Wiley & Sons, Ltd.
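For readers unfamiliar with the underlying algorithm, the sketch below shows plain textbook LZW applied to a byte sequence, which is how repeated bytecode sequences come to be represented by single codes. It is not LZW‐CC itself: the abstract's interpreter changes, jump recalculation, and new-instruction generation are not modeled, and the function names are assumptions for this example.

```python
def lzw_compress(data):
    """Textbook LZW: emit one code per longest already-seen byte sequence."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate                  # keep extending the match
        else:
            codes.append(table[current])         # emit the longest match
            table[candidate] = len(table)        # learn the new sequence
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the table while decoding; handles the code-not-yet-defined case."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = previous + previous[:1]      # sequence defined by this very step
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

# A repetitive "bytecode" stream of three opcodes repeated four times.
bytecode = bytes([1, 2, 3] * 4)
codes = lzw_compress(bytecode)
```

On this repetitive input the twelve input bytes are represented by seven codes, and decompression restores the original bytes exactly.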

6.

Transforming binary code for low-power embedded processors

Petrov P. Orailoglu A. 《Micro, IEEE》2004,24(3):21-33

Two program code transformation methodologies reduce the power consumption of instruction communication buses in embedded processors. Aimed at deep-submicron process technologies, these techniques offer an efficient solution for applications in which low power consumption is the key quality factor. We have developed two techniques for power minimization on the instruction bus of embedded processors. The first is compiler-driven register name adjustment (RNA), with the main goal of power minimization on instruction fetch and register file access. The second technique, more general in nature, incorporates transformations into the binary program code and necessitates hardware support on the processor side to efficiently restore the power-optimized program code.

7.

An Automatic Custom Instruction Identification Method for High-Level Synthesis

肖成龙林军王珊珊王宁《计算机应用》2018,38(7):2024-2031

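To address the difficulty of improving performance and reducing power consumption during high-level synthesis (HLS), an automatic custom instruction identification method for HLS is proposed. Custom instructions are enumerated and selected before the HLS process, providing a general custom instruction identification method for HLS. First, the high-level source code is converted into a control data flow graph (CDFG), which preprocesses the source code. Second, based on the data flow graphs (DFGs) within the CDFG, a subgraph enumeration algorithm enumerates all connected convex subgraphs in a bottom-up manner, which makes it easier for users to modify the constraints flexibly. Then, a subgraph selection algorithm chooses some of the best subgraphs as the final custom instructions, considering area, performance, and code size in turn. Finally, new code is regenerated with the selected custom instructions and used as the input to the HLS tool. Compared with conventional HLS, pattern selection based on occurrence frequency reduces area by 19.1% on average, and subgraph selection based on the critical path reduces latency by 22.3% on average. In addition, compared with the TD algorithm, the enumeration efficiency of the proposed algorithm improves by 70.8% on average. The experimental results show that automatic custom instruction identification lets HLS significantly improve performance and reduce area and code size in circuit design.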

8.

Code-carrying theories

Bart Jacobs Sjaak Smetsers Ronny Wichers Schreur 《Formal Aspects of Computing》2007,19(2):191-203

This paper is both a position paper on a particular approach in program correctness, and also a contribution to this area. The approach entails the generation of programs (code) from the executable content of logical theories. This capability already exists within the main theorem provers such as Coq, Isabelle, ACL2 and PVS. Here we will focus on issues portraying the use of this methodology, rather than the underlying theory. We illustrate the power of the approach within PVS via two case studies (on unification and compression) that lead to actual running code. We also demonstrate its flexibility by extending the program generation capabilities. This paper fits in a line of ongoing integration of programming and proving.

9.

xpTools: A Customization Environment for Code Compression Systems

王志刚周学海李曦杨君《小型微型计算机系统》2006,27(7):1250-1253

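Code compression techniques compress all or part of an application's instructions and can effectively reduce memory size, power consumption, and so on. However, code compression has not been widely adopted, mainly because an effective customization environment is lacking. To meet the needs of code compression, this paper proposes xpTools, a retargetable tool set for customizing code compression systems. The tool set (including a compiler, simulator, analysis tools, synthesis tools, and so on) can be generated automatically from the instruction set chosen by the designer. With these tools, designers can weigh the impact of code compression on system performance, power consumption, size, and other aspects for a specific application, and then customize the best code compression system.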

10.

Two versions of architectures for dynamic implied addressing mode

Jonghee M. Youn Minwook Ahn Yunheung Paek Jongwung Kim Jeonghun Cho 《Journal of Systems Architecture》2010,56(8):368-383

The complexity of today’s embedded applications increases with various requirements such as execution time, code size or power consumption. To satisfy these requirements, efficient instruction set design is one of the important issues, because an instruction customized for specific applications can yield better performance than multiple instructions in terms of execution time, code size, and power consumption. Limited encoding space, however, does not allow application-specific and complex instructions to be added freely to the instruction set architecture. To resolve this problem, conventional architectures increase the free space for encoding by trimming excessive bits required beyond the fixed word length. This approach, however, shows severe weaknesses in terms of compiler complexity, code size and execution time. In this paper, we propose a new instruction encoding scheme based on the dynamic implied addressing mode (DIAM) to resolve the limited encoding space and the side effects of trimming. We report two versions of architectures that support our DIAM-based approach. In the first version, we use a special on-chip memory to store extra encoding information. In the second version, we replace the memory with a small on-chip buffer along with a special instruction. We also suggest a code generation algorithm to fully utilize DIAM. In our experiments, the architecture augmented with DIAM shows about 8% code size reduction and 18% speedup on average, compared to the basic architecture without DIAM.

11.

Studying the code compression design space – A synthesis approach

《Journal of Systems Architecture》2014,60(2):179-193

The embedded domain has witnessed the application of different code compression methodologies on different architectures to bridge the gap between ever-increasing application size and scarce memory resources. Selection of a code compression technique for a target architecture requires a detailed study and analysis of the code compression design space. There are multiple design parameters affecting the space, time, cost and power dimensions. Standard approaches to exploring the code compression design space are tedious, time consuming, and almost impractical given the increasing number of proposed compression algorithms. This is one of the biggest challenges faced by an architect trying to adopt a code compression methodology for a target architecture. We propose a novel synthesis-based tool-chain for fast and effective exploration of the code compression design space and for evaluation of the tradeoffs. The tool-chain consists of a frontend framework that works with different compression/decompression schemes and a backend with high-level-synthesis, logic-synthesis, and power estimation tools to output the critical design parameters. We use the tool-chain to effectively analyze different code compression/decompression schemes of varying complexities.

12.

Code compression by register operand dependency

《Journal of Systems and Software》2004,72(3):295-304

This paper proposes a dictionary-based code compression technique that maps the source register operands to the nearest occurrence of a destination register in the predecessor instructions. The key idea is that most destination registers are very likely to be used as source registers in the following instructions. The dependent registers can be removed from the dictionary if this information can be specified otherwise. Such destination–source relationships are so common that making use of them can result in much better code compression. After removing the dependent register operands, the original dictionary size can be reduced significantly. As a result, the compression ratio can benefit from: (a) the reduction of dictionary size due to the removal of dependent registers, and (b) the reduction of program encoding due to the reduced number of dictionary entries. A set of programs has been compressed using this feature. The compression results show that the compression ratio is reduced to 38.41% on average for MediaBench benchmarks compiled for the MIPS R2000 processor, as opposed to 45% using operand factorization.
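The destination–source relationship described above can be made concrete with a small sketch that, for each source operand, finds the distance back to the nearest earlier instruction that wrote the same register. This is only an illustration of the dependency being exploited: the instruction representation, the function name, and the straight-line (no control flow) assumption are all hypothetical, and the paper's actual encoding of this information is not given in the abstract.

```python
def nearest_def_distances(instructions):
    """For every source register, return how many instructions back the nearest
    write to that register occurred, or None if no earlier instruction wrote it.

    `instructions` is a list of (dest, sources) pairs for straight-line code;
    dest may be None for instructions that write no register.
    """
    last_def = {}   # register name -> position of its most recent write
    result = []
    for pos, (dest, sources) in enumerate(instructions):
        result.append([pos - last_def[src] if src in last_def else None
                       for src in sources])
        if dest is not None:
            last_def[dest] = pos  # record the write after reading the sources
    return result

# Hypothetical three-instruction sequence.
code = [
    ("r1", ["r2", "r3"]),   # r1 = r2 op r3
    ("r4", ["r1", "r2"]),   # r4 = r1 op r2
    ("r5", ["r4", "r1"]),   # r5 = r4 op r1
]
distances = nearest_def_distances(code)
```

A source operand with a small distance is fully determined by that distance, so in principle the register name need not be stored for it; operands marked None still need an explicit register field.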

13.

An on-chip instruction cache design with one-bit tag for low-power embedded systems

Ji Gu Hui Guo Patrick Li 《Microprocessors and Microsystems》2011,35(4):382-391

The on-chip instruction cache is a potentially power-hungry component in embedded systems due to its large chip area and high access frequency. Aiming at reducing the power consumption of the on-chip cache, we propose a Reduced One-Bit Tag Instruction Cache (ROBTIC), where the cache size is judiciously reduced and the cache tag field only contains the least significant bit of the full tag. We develop a cache operational control scheme for ROBTIC so that with the one-bit cache tag, the program locality can still be efficiently exploited. For applications where most of the memory accesses are localized, our cache can achieve performance similar to that of a traditional full-tag cache; however, the power consumption of the cache can be significantly reduced due to the much smaller cache size, narrower tag array (just one bit), and smaller tag comparison circuit being used. Experiments on a set of benchmarks implemented in CMOS 180 nm process technology demonstrate that our proposed design can reduce dynamic power consumption by up to 27.3% and area by up to 30.9% relative to the traditional cache when the cache size is fixed at 32 instructions, which outperforms the existing partial-tag based cache design. With cache size customization, a further 47.8% power saving can be achieved. Our experimental results also show that when implemented in deep sub-micron technologies, where the leakage power is not negligible, our design is still efficient: a consistent power saving trend (about 22%) has been observed for technologies from 130 nm down to 65 nm.

14.

A Dual-Index Data Compression Method for Satellite Onboard Computers

邓岸华乔磊杨孟飞《软件学报》2022,33(10):3844-3857

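As the functions of onboard computer systems grow more complex, program size is also expanding rapidly. With storage resources extremely constrained, a stable and effective code compression capability is needed to ensure that onboard software can be stored and run properly. Hybrid compression algorithms are currently the mainstream approach to lossless data compression; they feature a high compression ratio but large code size and heavy computing resource demands. However, in embedded systems such as aerospace onboard computers, the special operating environment demands high reliability and resistance to interference, so hybrid compression algorithms cannot deliver their intended effect. Meanwhile, single-model compression yields a low compression ratio. To address these problems, an improved method is proposed that builds on the code size and memory consumption advantages of the LZ77 algorithm: a new match record table is designed for the compression process to store indexes of high-value data and assist compression, so that the locality advantage of the original algorithm is complemented by the global distribution of high-value data, reducing data redundancy further. Dynamic padding, variable-length coding, and other measures further optimize the encoding structure and lower storage requirements. Finally, a lossless data compression algorithm better suited to the aerospace embedded environment (LZRC) is designed and implemented. Experimental results show that: (1) with a code size only 3.5 KB larger than that of the LZ77 algorithm, the new algorithm improves the average compression ratio on software code by 17%; (2) the new algorithm needs only 12% of the runtime memory of the hybrid compression algorithm and its code size is 84% smaller, making it better suited to onboard computer systems.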

15.

Research on Program Optimization Methods for Embedded Systems

王丽芳符意德纽远《通讯和计算机》2005,2(4):14-17

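Embedded systems often have special requirements for real-time behavior, system power consumption, and program code length. This paper discusses program code optimization methods that meet these requirements from the perspective of program design. It first discusses methods for optimizing program execution time, then methods for optimizing program code length, and finally methods for optimizing program power consumption.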

16.

Branch Optimization for Performance and Power Consumption on GPU Platforms

于齐王博千沈立王志英陈微《计算机科学》2016,43(5):22-26

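Their powerful computing capability has led to the wide use of GPGPUs in general-purpose computing. However, because GPGPUs work in SIMT (Single Instruction Multiple Threads) fashion, their execution efficiency is severely affected by branch divergence in applications. Although thread swapping has been proposed to reduce the performance loss caused by branches, this approach often introduces extra memory accesses, which not only reduce the performance gain of thread-swapping optimization to some extent but also increase power consumption. This paper first illustrates by example how the thread-swapping range affects program performance and power consumption, and then proposes a method to reduce the extra memory accesses introduced by thread swapping. Experiments show that for the Reduction program, with a swapping range of 256, power consumption decreases by up to 7% at an average performance loss of 4%; for the Bitonic program, with swapping ranges of 256 and 512, performance improves by up to 6.4% and 5.3% respectively, with no power overhead.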

17.

Design and Implementation of a Loop Instruction Buffer for VLIW Processors

李勇胡慧俐杨焕荣《计算机应用》2014,34(4):1005-1009

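Loops account for a large share of execution time in digital signal processing software. Holding loop code temporarily in an instruction buffer reduces the number of program memory accesses and improves processor performance. A buffer supporting loop instructions is added to the instruction pipeline of a VLIW processor; it can cache the instructions of a loop and dispatch them to the functional units in software-pipelined form. In this way, loop code is fetched from memory only once but executed many times, greatly reducing memory accesses. While loop instructions are running, the buffer signals the program memory to enter a sleep state, which lowers processor power consumption. Tests with typical applications show that with the loop buffer, the idle rate of the instruction fetch pipeline can exceed 90% and overall processor performance improves by about 10%, while the hardware area overhead of the loop buffer is about 9% of the fetch pipeline.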

18.

COMPASS – A tool for evaluation of compression strategies for embedded processors

Sreejith K. Priti 《Journal of Systems Architecture》2008,54(10):995-1003

A major concern of embedded system architects is the design for low power. We address one aspect of the problem in this paper, namely the effect of executable code compression. There are two benefits of code compression: first, a reduction in the memory footprint of embedded software, and second, a potential reduction in memory bus traffic and power consumption. Since decompression has to be performed at run time, it is done in hardware. We describe a tool called COMPASS which can evaluate a range of strategies for any given set of benchmarks and display compression ratios. Also, given an execution trace, it can compute the effect on bus toggles and cache misses for a range of compression strategies. The tool is interactive and allows the user to vary a set of parameters and observe their effect on performance. We describe an implementation of the tool and demonstrate its effectiveness. To the best of our knowledge this is the first tool proposed for such a purpose.
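The bus-toggle metric mentioned above is commonly computed as the number of bit positions that change between consecutive words on the bus, that is, the Hamming distance between neighbors in the trace. The sketch below shows that calculation; it is a generic illustration, and whether COMPASS uses exactly this definition is an assumption, as is the function name.

```python
def bus_toggles(words):
    """Total number of bit flips between consecutive words driven on a bus.

    `words` is the sequence of non-negative integers seen on the bus, in order.
    XOR marks the bits that differ; counting the ones gives the flips per step.
    """
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))

# Hypothetical 4-bit trace: 0000 -> 1111 flips four lines, 1111 -> 1010 flips two.
trace = [0b0000, 0b1111, 0b1010]
toggles = bus_toggles(trace)
```

Comparing this count for the same execution trace fetched from an uncompressed and a compressed image gives a first-order estimate of how compression changes bus switching activity.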

19.

A Test Data Compression Algorithm Based on PTIDR Coding

李国亮冯建华崔小乐《计算机辅助设计与图形学学报》2008,20(2):161-166

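To reduce test data storage, an effective new test data compression code, PTIDR coding, is proposed, and a compression/decompression scheme based on it is constructed. PTIDR coding achieves a higher compression ratio than FDR, EFDR, Alternating FDR, and similar codes; its decoder is also relatively simple and easy to implement, and it effectively reduces hardware overhead. Compared with Selective Huffman and CDCR coding, PTIDR coding achieves a higher ratio of compression ratio to area overhead. In particular, when the probability p of 0s in the difference test set satisfies p ≥ 0.7610, PTIDR coding achieves a higher compression ratio than FDR coding, thereby reducing chip test cost.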

20.

Trace-based leakage energy optimisations at link time

《Journal of Systems Architecture》2007,53(1):1-20

Energy-aware compilers are becoming increasingly important for embedded systems due to the need to meet a variety of design constraints on time, code size and power consumption. This paper introduces for the first time a trace-based, link-time compiler framework on binaries for embedded systems and evaluates its potential benefits in supporting energy optimisations, especially those that exploit the interaction between compilers and architecture. We present two algorithms for reducing leakage energy in functional units and data caches, respectively. Both algorithms work uniformly at the granularity of optimisation regions that are formed by the hot traces of a program. Our experimental results using Mediabench benchmarks show that good leakage energy savings can be achieved at the cost of some small performance and code size penalties. Furthermore, by varying the granularity of optimisation regions, which is a tunable parameter, embedded application programmers can make the tradeoffs between energy savings and these associated costs.