Similar literature
 20 similar documents found (search time: 15 ms)
1.
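A pipeline design for a stack-based Java processor   Cited by: 1 (self-citations: 1, other citations: 1)
杨骥, 毛峡. Computer Engineering and Design (《计算机工程与设计》), 2004, 25(12): 2357-2359
Targeting the characteristics of current embedded systems, a stack-based Java microprocessor core with a four-stage pipeline is designed. A dual-port RAM is used as the Java stack, reducing storage resource consumption. Most simple arithmetic/logic instructions of the Java Virtual Machine (JVM) are executed directly in hardware within one clock cycle; instructions of medium complexity are completed over several clock cycles through microcode emulation; and a hardware trap mechanism is provided to support software emulation of the JVM's very complex and object-oriented instructions. The implementation method for each instruction can be chosen flexibly by weighing hardware resources against execution efficiency, which eases porting the Java processor to FPGAs.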

2.
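Using just-in-time compilation in an embedded Java chip   Cited by: 1 (self-citations: 0, other citations: 1)
The Java Virtual Machine is both stack-oriented and object-oriented, which makes it hard for hardware to support direct bytecode execution effectively, and traditional JIT compilers are ill-suited to embedded environments. This paper describes a technical approach to using JIT in a self-designed embedded Java chip: by supporting the JVM stack and complex instructions in hardware and cooperating closely with the JIT software, it addresses these Java chip design problems well. Test results show that, relative to the picoJava-Ⅱ core, currently the best in the industry, the performance of JC401's compiled code improves by a factor of 1.2 to 1.9, with considerable improvements in hardware complexity, execution speed, and memory overhead, making it suitable for embedded applications.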

3.
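To execute Java bytecode, which has a CISC structure, directly in hardware, the JPOR-32 instruction set is designed and implemented for a 32-bit embedded real-time Java platform. The function and implementation principle of each Java bytecode in the Java Virtual Machine specification are analyzed, and the changes of signals and data on the Java processor's data path during the execution of each instruction are specified. Complex instructions are executed through microinstructions while simple instructions are executed directly, giving the JPOR-32 instruction set RISC characteristics. Experimental results verify the correctness of the instruction set and the predictability of its worst-case execution time (WCET).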

4.
Ghahramani B., Pauley M.A. Computer, 2003, 36(9): 109-111
Java programs are executed by a Java virtual machine (JVM), which interprets intermediate compiled bytecode that is nominally platform independent. Although early versions of Java interpreted unoptimized bytecode in a relatively unsophisticated manner, recent developments including static analysis, just-in-time compilation, JVM optimization, and instruction-level optimizations have improved execution efficiency. Consequently, Java is now competitive with C and C++ for some applications and on some platforms. Despite Java's increasing popularity, there is a lingering perception that deficiencies in the language make it unsuitable for high-performance computing. In this paper we address some of those deficiencies and discuss the suitability of using Java in a distributed environment.

5.
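Traditional Java programs rely on a software Java Virtual Machine (JVM) to interpret or recompile Java bytecode files before handing them to the native CPU for execution, which severely limits their running speed. A hardware JVM processor can execute Java bytecode directly and thus greatly increases the running speed of Java programs, so hardware JVM processors are the most effective way to break through the performance bottleneck of Java programs. Taking Jop Java and picoJava as examples and following the Java Virtual Machine specification, this paper analyzes the most important aspects of hardware JVM processors: the pipeline structure, the stack structure and the implementation of stack operations, instruction folding, and the mapping between bytecode and microcode. Improvements are also proposed.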

6.
We present a study of the static structure of real Java bytecode programs. A total of 1132 Java jar‐files were collected from the Internet and analyzed. In addition to simple counts (number of methods per class, number of bytecode instructions per method, etc.), structural metrics such as the complexity of control‐flow and inheritance graphs were computed. We believe this study will be valuable in the design of future programming languages and virtual machine instruction sets, as well as in the efficient implementation of compilers and other language processors. Copyright © 2006 John Wiley & Sons, Ltd.

7.
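Performance has always been a weakness that Java cannot avoid. However, besides causes inherent in Java itself, poor performance often results from applications that use Java without optimization. The virtual machine is the core of the Java platform. This paper studies the key technologies and operating mechanisms of the Java virtual machine (JVM) and analyzes its performance optimization measures, so that Java runs smoothly on different platforms, providing a reference for implementing a JVM or porting a JVM to various platforms.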

8.
This paper describes a new method for code space optimization for interpreted languages called LZW‐CC. The method is based on a well‐known and widely used compression algorithm, LZW, which has been adapted to compress executable program code represented as bytecode. Frequently occurring sequences of bytecode instructions are replaced by shorter encodings for newly generated bytecode instructions. The interpreter for the compressed code is modified to recognize and execute those new instructions. When applied to systems where a copy of the interpreter is supplied with each user program, space is saved not only by compressing the program code but also by automatically removing the unused implementation code from the interpreter. The method's implementation within two compiler systems for the programming languages Haskell and Java is described and implementation issues of interest are presented, notably the recalculations of target jumps and the automated tailoring of the interpreter to program code. Applying LZW‐CC to nhc98 Haskell results in bytecode size reduction by up to 15.23% and executable size reduction by up to 11.9%. Java bytecode is reduced by up to 52%. The impact of compression on execution speed is also discussed; the typical speed penalty for Java programs is between 1.8 and 6.6%, while most compressed Haskell executables run faster than the original. Copyright © 2008 John Wiley & Sons, Ltd.
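The underlying idea of replacing recurring instruction sequences with shorter codes can be illustrated with plain LZW over a list of opcode names. This is only a minimal sketch of textbook LZW, not the LZW-CC method itself (which generates new bytecode instructions and tailors the interpreter); the function names and the opcode alphabet are assumptions made for illustration.

```python
def lzw_compress(opcodes, alphabet):
    """Plain LZW over a sequence of opcode names (assumes every opcode is in alphabet)."""
    # Single opcodes get the first codes; recurring sequences get new codes as they appear.
    table = {(op,): code for code, op in enumerate(alphabet)}
    output = []
    current = ()
    for op in opcodes:
        candidate = current + (op,)
        if candidate in table:
            current = candidate          # keep extending a known sequence
        else:
            output.append(table[current])
            table[candidate] = len(table)  # register the new, longer sequence
            current = (op,)
    if current:
        output.append(table[current])
    return output


def lzw_decompress(codes, alphabet):
    """Inverse of lzw_compress; rebuilds the same table while decoding."""
    table = {code: (op,) for code, op in enumerate(alphabet)}
    if not codes:
        return []
    previous = table[codes[0]]
    result = list(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # The code refers to the sequence that is being defined right now.
            entry = previous + (previous[0],)
        result.extend(entry)
        table[len(table)] = previous + (entry[0],)
        previous = entry
    return result


# Hypothetical method body with a repeated load/add/store pattern.
program = ["iload_0", "iload_1", "iadd", "istore_2"] * 4
alphabet = sorted(set(program))
codes = lzw_compress(program, alphabet)
```

A real bytecode compressor would also have to keep jump targets valid after sequences are merged, which the abstract names as one of the implementation issues; this sketch ignores control flow entirely.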

9.
Parallel Computing, 2007, 33(4-5): 314-327
CPU cycle sharing among distributed heterogeneous computers is the key function in large-scale volunteer computing and desktop grid applications. One important problem in large-scale distributed cycle sharing systems is how to account for the amount of computation work performed by a CPU cycle provider, in a uniform and portable fashion across heterogeneous hardware and operating system platforms. Such an accounting mechanism is especially desirable when CPU resources are traded, and a lack of uniform workload accounting will hinder the enforcement of market-driven CPU pricing/trading policies in distributed cycle sharing systems. The Java Virtual Machine (JVM) has proved to be a good match for distributed cycle sharing because of its abilities to run applications on a wide variety of platforms without modification (portability) and to host untrusted applications (safety). In this paper, we present the design, implementation, and evaluation of an efficient, application-transparent virtual cycle accounting scheme integrated into the JVM. Our scheme achieves portable workload accounting across heterogeneous computing platforms by accounting for JVM virtual instructions instead of real processor cycles. Unlike the existing JVM CPU accounting mechanisms that involve bytecode rewriting, our scheme is transparent to applications and does not require visible changes to application and library code interfaces, which would break applications that use the Reflection API. Moreover, our scheme is efficient via the use of processor registers for accounting. Our experimental results demonstrate both high accounting accuracy and low runtime overhead of virtual cycle accounting.

10.
High-performance just-in-time compilers for Java need to invest considerable effort before actual code generation can commence. This is in part due to the very nature of the Java Virtual Machine, which is not well matched to the requirements of optimizing code generators. Alternative transportation formats based on Static Single Assignment form should theoretically be superior to virtual machines, but this claim has not previously been validated in practice. This paper revisits the topic and attempts to quantify the effect of using an SSA-based mobile code representation (IR) instead of a virtual-machine based one. To this end, we have integrated full support for a verifiable SSA-based IR into Jikes RVM, an existing Java execution environment. The resulting system is capable of loading and executing Java programs represented in either format, traditional JVM bytecode as well as the SSA-based representation, and it can even execute programs made up of a mixture of the two formats. In our implementation, the two alternative just-in-time compilation pipelines share a common low-level code generator. Performance results are encouraging and show simultaneous improvements in both compilation time and code quality relative to Jikes RVM's standard optimizing compiler for JVM class files. They support the hypothesis that SSA-based intermediate representations offer advantages in the context of just-in-time compilation.

11.
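To test the Java Virtual Machine (JVM), developers usually design complex test programs by hand or generate them with test generation tools in order to detect latent defects in the JVM. However, complex test programs make locating and fixing defects very costly for developers. Test program reduction aims to remove as much code unrelated to defect detection as possible while preserving the test program's defect-detection capability. Existing work based on delta debugging achieves good reduction results on C programs and XML inputs, but in the JVM testing scenario, reduction of Java test programs, which have complex syntactic and semantic dependencies, is still coarse-grained and ineffective, so the reduced programs remain costly to understand. Therefore, for Java test programs with complex program dependencies, this paper proposes JavaPruner, a fine-grained test program reduction method based on program constraints. It first designs a fine-grained code measurement method at the statement-block level, and then introduces dependency constraints between statement blocks into delta debugging to reduce the test program. Taking Java bytecode test programs as the experimental subjects, 50 test programs with complex dependencies were selected from existing test program generation tools for JVM testing as the benchmark dataset, on which the effectiveness of JavaPruner was validated. Experimental results show that JavaPruner can effectively remove redundant code from Java bytecode test programs. Compared with existing methods, reduction capability improves by 37.7% on average across all benchmark datasets. Meanwhile, JavaPruner can reduce a Java bytecode test program to as little as 1.09% of its original size while preserving program validity and defect-detection capability, effectively lowering the cost of analyzing and understanding test programs.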
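The combination of delta debugging with dependency constraints between statement blocks can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not JavaPruner's actual algorithm: the names `reduce_blocks`, `respects_dependencies`, `still_fails`, and `depends_on` are hypothetical, blocks are plain strings, and the loop is a basic chunk-removal variant of delta debugging.

```python
def respects_dependencies(kept, depends_on):
    """A candidate is valid only if every kept block's dependencies are kept too."""
    return all(dep in kept
               for block in kept
               for dep in depends_on.get(block, ()))


def reduce_blocks(blocks, still_fails, depends_on):
    """Remove chunks of blocks while the defect still triggers and dependencies hold."""
    current = list(blocks)
    chunk = max(len(current) // 2, 1)
    while chunk >= 1:
        removed_any = False
        start = 0
        while start < len(current):
            candidate = current[:start] + current[start + chunk:]
            if (candidate
                    and respects_dependencies(set(candidate), depends_on)
                    and still_fails(candidate)):
                current = candidate      # removal preserved the failure; retry same position
                removed_any = True
            else:
                start += chunk           # this chunk is needed; move on
        if not removed_any:
            chunk //= 2                  # refine granularity only when a pass removes nothing
    return current


# Hypothetical test program: block "e" triggers the defect and needs block "b" to be valid.
blocks = ["a", "b", "c", "d", "e", "f"]
depends_on = {"e": ["b"]}
still_fails = lambda candidate: "e" in candidate
reduced = reduce_blocks(blocks, still_fails, depends_on)
```

Without the dependency check the same loop would shrink the program to the single block `e`, which in a real setting could be an invalid program; the constraint keeps `b` alongside it.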

12.
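To support synchronization between threads, the Java Virtual Machine introduces monitor enter and exit instructions, but these cause serious performance problems in most Java programs, and current software implementations suffer from high memory overhead or low performance. The picoJava core therefore provides hardware support for monitors, which greatly improves performance and reduces memory overhead, but it suffers from a low hit rate on monitor entry. Based on the observation that monitor operations in Java programs have a low re-entry frequency but follow regular patterns, an efficient hardware support scheme and a corresponding algorithm are proposed, effectively improving the hit…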

13.
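刘超. Computer Engineering (《计算机工程》), 2007, 33(7): 84-86
As a cross-platform programming language, Java is widely used in enterprise application development, desktop application development, and embedded development. To run Java programs on Loongson, the Sun HotSpot Java virtual machine was ported to Linux/Loongson 2. This paper describes the main work of the port, the problems encountered and their solutions, and the optimization work.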

14.
Mnemonics is a Scala library for generating method bodies in JVM bytecode at run time. Mnemonics supports a large subset of the JVM instructions, for which the static typing of the generator guarantees the well-formedness of the generated bytecode.

15.
Bytecode verification is one of the key security functions of several architectures for mobile and embedded code, including Java, Java Card, and .NET. Over the past few years, its formal correctness has been studied extensively by academia and industry, using general-purpose theorem provers. The objective of our work is to facilitate such endeavors by providing a dedicated environment for establishing the correctness of bytecode verification within a proof assistant. The environment, called Jakarta, exploits a methodology that casts the correctness of bytecode verification relative to a defensive virtual machine that performs checks at run-time and to an offensive one that does not; it can be summarized as stating that the two machines coincide on programs that pass bytecode verification. Such a methodology has been used successfully to prove the correctness of the Java Card bytecode verifier and may potentially be applied to many similar problems. One definite advantage of the methodology is that it is amenable to automation. Indeed, Jakarta automates the construction of an offensive virtual machine and a bytecode verifier from a defensive machine, and the proofs of correctness of the bytecode verifier. We illustrate the principles of Jakarta on a simple low-level language extended with subroutines and discuss its usefulness to proving the correctness of the Java Card platform.

16.
Abstract machines bridge the gap between the high level of programming languages and the low-level mechanisms of a real machine. The paper proposes a general abstract-machine-based framework (AMBF) to build instruction-level parallelism (ILP) processors using the instruction tagging technique. The constructed processor may accept code written in any (abstract or real) machine instruction set, and produce tagged machine code after data conflicts are resolved. This requires the construction of a tagging unit which emulates the sequential execution of the program using tags rather than actual values. The paper presents a Java ILP processor built with the proposed framework. The Java processor takes advantage of the tagging unit to dynamically translate Java bytecode instructions to RISC-like tag-based instructions to facilitate the use of a general-purpose RISC core and enable the exploitation of instruction-level parallelism. We detail the Java ILP processor architecture and the design issues. Benchmarking of the Java processor using SpecJVM98 and Linpack has shown an overall ILP speedup improvement between 78% and 173%.

17.
This paper presents an architecture-independent characterization of embedded Java workloads based on the industry standard GrinderBench benchmark, which includes different classes of real-world embedded Java applications. This work is based on a custom-built embedded Java virtual machine (JVM) simulator specifically designed for embedded JVM modeling, which embodies domain-specific details such as thread scheduling, algorithms used for native CLDC APIs, and runtime data structures optimized for use in embedded systems. The results presented include dynamic execution characteristics, dynamic bytecode instruction mix, application and API workload distribution, object allocation statistics, instruction-set coverage, memory usage statistics, and method code and stack frame characteristics.

18.
The Java Virtual Machine is primarily designed for transporting Java programs. As a consequence, when JVM bytecodes are used to transport programs in other languages, the result becomes less acceptable the more the source language diverges from Java. Microsoft's .NET transport format fares better in this respect because it has a more flexible type system and instruction set, but it is not extensible, and (for example) has no provision for supporting explicit programmer-specified parallelism. Both platforms have difficulty making transported programs run efficiently. This paper discusses first steps towards mobile code representations that are independent (in the sense that the representation can be appropriately parameterized) of the source language (e.g., Java), intermediate representation (e.g., bytecode), and target architecture (e.g., x86). We call this kind of parameterizable framework language-agnostic. We present two techniques which provide parts of the envisioned language-agnostic functionality. Compressed abstract syntax trees as a wire format provide for a very dense encoding of programs at a high level of abstraction. We show how to parameterize the compression algorithm in a modular fashion with knowledge beyond the purely syntactical level. This leads to the notion of well-formedness by construction. The second technique defines the semantics of programs by mapping from abstract syntax trees to a typed core calculus representation. Based on this representation it becomes possible to use portable definitions of security policies and to execute programs written in different source languages, even if a more efficient trusted native compiler is not available on the target platform.

19.
Hardware bytecode translation is a technique to improve the performance of the Java virtual machine (JVM), especially on portable devices for which the overhead of dynamic compilation is significant. However, since the translation is done on a single bytecode basis, a naive implementation of the JVM generates frequent memory accesses for local variables, which can be not only a performance bottleneck but also an obstacle for instruction folding. A solution to this problem is to add a small register file to the data path of the microprocessor which is dedicated to storing local variables. However, the effectiveness of such a local variable register file depends on its size and on the local variable access behavior of the applications. In this paper, we analyze the local variable access behavior of various Java applications. In particular, we investigate the fraction of local variable accesses that are covered by a register file of varying size, which determines the chip area overhead and the operation speed. We also evaluate the effectiveness of the sliding register window for parameter passing in the context of the JVM and on-the-fly optimization of the local variable to register file mapping. With two types of exceptions, a 16-entry register file achieves coverages of up to 98%. The first type of exception is represented by the SAXON XSLT processor, for which the effect of cold misses is significant. Adding the sliding window feature to the register file for parameter passing turns 6.2-13.3% of total accesses from misses into register file hits for SAXON with XSLTMark. The second type of exception is represented by the FFT, which accesses more than 16 local variables for most method invocations. In this case, on-the-fly profiling is effective. The hit ratio of a 16-entry register file for the FFT is increased from 44% to 83% by an array of 8-bit counters.
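The coverage metric discussed above can be made concrete with a small sketch. It assumes a trace is simply a list of local variable indices in access order, that a fixed mapping places local variables 0 to N-1 in an N-entry register file, and that the profiled variant has perfect offline knowledge of access counts (the paper instead uses on-the-fly 8-bit counters). The function names and traces are hypothetical.

```python
from collections import Counter


def coverage(access_trace, register_file_size):
    """Fraction of accesses that hit when locals 0..N-1 are held in the register file."""
    if not access_trace:
        return 0.0
    hits = sum(1 for index in access_trace if index < register_file_size)
    return hits / len(access_trace)


def profiled_coverage(access_trace, register_file_size):
    """Fraction of accesses that hit when the N most frequently used locals are held instead."""
    if not access_trace:
        return 0.0
    counts = Counter(access_trace)
    hot = {index for index, _ in counts.most_common(register_file_size)}
    hits = sum(1 for index in access_trace if index in hot)
    return hits / len(access_trace)


# Hypothetical trace: mostly low-numbered locals, with two accesses beyond index 15.
trace = [0, 1, 2, 0, 1, 17, 3, 20]
# Hypothetical trace dominated by high-numbered locals, where a fixed mapping does poorly.
skewed_trace = [20, 20, 20, 21, 21, 0]
```

Comparing `coverage(skewed_trace, 2)` with `profiled_coverage(skewed_trace, 2)` shows why remapping by access frequency helps methods that mostly touch high-numbered locals, which is the situation the abstract reports for the FFT.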

20.
Java just-in-time (JIT) compilers improve the performance of a Java virtual machine (JVM) by translating Java bytecode into native machine code on demand. One important problem in Java JIT compilation is how to map stack entries and local variables to registers efficiently and quickly, since register-based computations are much faster than memory-based ones, while JIT compilation overhead is part of the whole running time. This paper introduces LaTTe, an open-source Java JIT compiler that performs fast generation of efficiently register-mapped RISC code. LaTTe first maps "all" local variables and stack entries into pseudoregisters, followed by real register allocation which also aggressively coalesces copies corresponding to pushes and pops between local variables and stack entries. Our experimental results indicate that LaTTe's sophisticated register mapping and allocation really pay off, achieving twice the performance of a naive JIT compiler that maps all local variables and stack entries to memory. It is also shown that LaTTe makes a reasonable trade-off between quality and speed of register mapping and allocation for the bytecode. We expect these results will also be beneficial to parallel and distributed Java computing: 1) by enhancing single-thread Java performance; and 2) by significantly reducing the number of memory accesses which the rest of the system must properly order to maintain coherence and keep threads synchronized.
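The general idea of turning stack-based bytecode into register-style code, with the push/pop copies folded away, can be sketched for straight-line integer code. This is an illustrative toy, not LaTTe's algorithm: it handles only a handful of opcodes, ignores control flow and real register allocation, and the naming scheme (`l<n>` for locals, `t<n>` for temporaries) and the function `map_to_registers` are assumptions.

```python
def map_to_registers(bytecode):
    """Translate straight-line stack bytecode into three-address code over pseudoregisters."""
    operators = {"iadd": "+", "isub": "-", "imul": "*"}
    stack = []   # symbolic operand stack: holds pseudoregister names, not values
    code = []
    temps = 0
    for instruction in bytecode:
        op, _, arg = instruction.partition("_")
        if op == "iload":
            # A push of a local becomes a reference to its pseudoregister (no copy emitted).
            stack.append(f"l{arg}")
        elif op == "istore":
            target = f"l{arg}"
            value = stack.pop()
            if target in stack:
                # The old value of the local is still pending on the stack: save it first.
                saved = f"t{temps}"
                temps += 1
                code.append(f"{saved} = {target}")
                stack = [saved if slot == target else slot for slot in stack]
            code.append(f"{target} = {value}")
        elif op in operators:
            right = stack.pop()
            left = stack.pop()
            result = f"t{temps}"
            temps += 1
            code.append(f"{result} = {left} {operators[op]} {right}")
            stack.append(result)
        else:
            raise ValueError(f"unsupported instruction: {instruction}")
    return code


# Hypothetical method body computing local3 = local1 + local2.
add_code = map_to_registers(["iload_1", "iload_2", "iadd", "istore_3"])
# Swapping two locals through the stack exercises the save-before-overwrite case.
swap_code = map_to_registers(["iload_0", "iload_1", "istore_0", "istore_1"])
```

Tracking the stack symbolically is what lets the loads disappear: four bytecodes become two register instructions in the first example, and no operand ever goes through memory.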
