首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
JavaSplit is a portable runtime environment for distributed execution of standard multithreaded Java programs. It gains augmented computational power and increased memory capacity by distributing the threads and objects of an application among the available machines. Unlike previous works, which either forfeit Java portability by using a nonstandard Java Virtual Machine (JVM) or compromise transparency by requiring user intervention in making the application suitable for distributed execution, JavaSplit automatically executes standard multithreaded Java on any heterogenous collection of Java-enabled machines. Each machine carries out its part of the computation using nothing but its local standard (unmodified) JVM. Neither the programmer nor the person submitting the program for execution needs to be aware of JavaSplit or its distributed nature. We evaluate the efficiency of JavaSplit on several combinations of operating systems, JVM implementations, and communication hardware.  相似文献   

2.
The increasing demand for performance has stimulated the wide adoption of many-core accelerators like Intel® Xeon PhiTM Coprocessor, which is based on Intel’s Many Integrated Core architecture. While many HPC applications running in native mode have been tuned to run efficiently on Xeon Phi, it is still unclear how a managed runtime like JVM performs on such an architecture. In this paper, we present the first measurement study of a set of Java HPC applications on Xeon Phi under JVM. One key obstacle to the study is that there is currently little support of Java for Xeon Phi. This paper presents the result based on the first porting of OpenJDK platform to Xeon Phi, in which the HotSpot virtual machine acts as the kernel execution engine. The main difficulty includes the incompatibility between Xeon Phi ISA and the assembly library of Hotspot VM. By evaluating the multithreaded Java Grande benchmark suite and our ported Java Phoenix benchmarks, we quantitatively study the performance and scalability issues of JVM on Xeon Phi and draw several conclusions from the study. To fully utilize the vector computing capability and hide the significant memory access latency on the coprocessor, we present a semi-automatic vectorization scheme and software prefetching model in HotSpot. Together with 60 physical cores and tuning, our optimized JVM achieves averagely 2.7x and 3.5x speedup compared to Xeon CPU processor by using vectorization and prefetching accordingly. Our study also indicates that it is viable and potentially performance-beneficial to run applications written for such a managed runtime like JVM on Xeon Phi.  相似文献   

3.
Distributed Java virtual machine (dJVM) systems enable concurrent Java applications to transparently run on clusters of commodity computers. This is achieved by supporting Java's shared‐memory model over multiple JVMs distributed across the cluster's computer nodes. In this work, we describe and evaluate selective dynamic diffing and lazy home allocation, two new runtime techniques that enable dJVMs to efficiently support memory sharing across the cluster. Specifically, the two proposed techniques can contribute to reduce the overheads due to message traffic, extra memory space, and high latency of remote memory accesses that such dJVM systems require for implementing their memory‐coherence protocol either in isolation or in combination. In order to evaluate the performance‐related benefits of dynamic diffing and lazy home allocation, we implemented both techniques in Cooperative JVM (CoJVM), a basic dJVM system we developed in previous work. In subsequent work, we carried out performance comparisons between the basic CoJVM and modified CoJVM versions for five representative concurrent Java applications (matrix multiply, LU, Radix, fast Fourier transform, and SOR) using our proposed techniques. Our experimental results showed that dynamic diffing and lazy home allocation significantly reduced memory sharing overheads. The reduction resulted in considerable gains in CoJVM system's performance, ranging from 9% up to 20%, in four out of the five applications, with resulting speedups varying from 6.5 up to 8.1 for an 8‐node cluster of computers. Copyright © 2007 John Wiley & Sons, Ltd.  相似文献   

4.
The context of this work is a practical, open‐source visualization system, called JIVE, that supports two forms of runtime visualizations of Java programs – object diagrams and sequence diagrams. They capture, respectively, the current execution state and execution history of a Java program. These diagrams are similar to those found in the UML for specifying design–time decisions. In our work, we construct these diagrams at execution time, thereby ensuring continuity of notation from design to execution. In so doing, a few extensions to the UML notation are proposed in order to better represent runtime behavior. As sequence diagrams can become long and unwieldy, we present techniques for their compact representation. A key result in this paper is a novel labeling scheme based upon regular expressions to compactly represent long sequences and an O(r2) algorithm for computing these labels, where r is the length of the input sequence, based upon the concept of ‘tandem repeats’ in a sequence. Horizontal compaction greatly helps minimize the extent of white space in sequence diagrams by the elimination of object lifelines and also by grouping lifelines together. We propose a novel extension to the sequence diagram to deal with out‐of‐model calls when the lifelines of certain classes of objects are filtered out of the visualization, but method calls may occur between in‐model and out‐of‐model calls. The paper also presents compaction techniques for multi‐threaded Java execution with different forms of synchronization. Finally, we present experimental results from compacting the runtime visualizations of a variety of Java programs and execution trace sizes in order to demonstrate the practicality and efficacy of our techniques. Copyright © 2016 John Wiley & Sons, Ltd.  相似文献   

5.
Ghahramani  B. Pauley  M.A. 《Computer》2003,36(9):109-111
Java programs are executed by a Java virtual machine (JVM), which interprets intermediate compiled bytecode that is nominally platform independent. Although early versions of Java interpreted unoptimized bytecode in a relatively unsophisticated manner, recent developments including static analysis, just-in-time compilation, JVM optimization, and instruction-level optimizations have improved execution efficiency. Consequently, Java is now competitive with C and C++ for some applications and on some platforms. Despite Java's increasing popularity, there is a lingering perception that deficiencies in the language make it unsuitable for high-performance computing. In this paper we address some of those deficiencies and discuss the suitability of using Java in a distributed environment.  相似文献   

6.
Java virtual machine (JVM) crashes are often due to an invalid memory reference to the JVM heap. Before the bug that caused the invalid reference can be fixed, its location must be identified. It can be in either the JVM implementation or the native library written in C invoked from Java applications. To help system engineers identify the location, we implemented a feature using page protection that prevents threads executing native methods from referring to the JVM heap. This feature protects the JVM heap during native method execution; if the heap is referred to invalidly, it interrupts the execution by generating a page-fault exception. It then reports the location where the exception was generated. The runtime overhead for using this feature depends on the frequency of native method calls because the protection is switched on each time a native method is called. We evaluated the runtime overhead by running the SPECjvm98, SPECjbb2000, VolanoMark, and JFCMark benchmark suites on a PC with two Intel Xeon® 1.6 GHz processors. The performance loss was less than 2% for the benchmark items that do not call native methods so frequently (104 times per second) and 5%–20% for the benchmark items that do (104–105 times per second). The worst performance loss was 54%, which was recorded for a benchmark item that calls native methods 2.0×106 times per second.  相似文献   

7.
JDCS:实现高性能计算的分布式计算系统   总被引:2,自引:0,他引:2  
分布对象计算技术提供了充分利用现有网络资源的有效途径。JavaRMI是当前比较成熟的一种分布对象技术,它提供了使用Java对象的简单和直接的方法。该文建立基于JavaRMI方法的适用于高性能计算的分布式计算系统JDCS。在JDCS中由网络上的计算结点构成服务器池,为用户提供高性能的计算服务。实现结果表明该系统可以获得较高的加速比。  相似文献   

8.
《Parallel Computing》2007,33(4-5):314-327
CPU cycle sharing among distributed heterogeneous computers is the key function in large-scale volunteer computing and desktop grid applications. One important problem in large-scale distributed cycle sharing system is how to account for the amount of computation work performed by a CPU cycle provider, in a uniform and portable fashion across heterogeneous hardware and operating system platforms. Such an accounting mechanism is especially desirable when CPU resources are traded and a lack of uniform workload accounting will hinder the enforcement of market-driven CPU pricing/trading policies in distributed cycle sharing systems. Java Virtual Machine (JVM) has proved to be a good match for distributed cycle sharing because of its abilities to run applications on a wide variety of platforms without modification (portability) and to host untrusted applications (safety). In this paper, we present the design, implementation, and evaluation of an efficient, application-transparent virtual cycle accounting scheme integrated into JVM. Our scheme achieves portable workload accounting across heterogeneous computing platforms by accounting for JVM virtual instructions instead of real processor cycles. Different from the existing JVM CPU accounting mechanisms that involve bytecode rewriting, our scheme is transparent to applications and does not require visible changes to application and library code interfaces which would break applications that use Reflection API. Moreover, our scheme is efficient via the use of processor registers for accounting. Our experimental results demonstrate both high accounting accuracy and low runtime overhead of virtual cycle accounting.  相似文献   

9.
《Parallel Computing》2014,40(10):754-767
The processing of massive amounts of data on clusters with finite amount of memory has become an important problem facing the parallel/distributed computing community. While MapReduce-style technologies provide an effective means for addressing various problems that fit within the MapReduce paradigm, there are many classes of problems for which this paradigm is ill-suited. In this paper we present a runtime system for traditional MPI programs that enables the efficient and transparent out-of-core execution of distributed-memory parallel programs. This system, called BDMPI,1 leverages the semantics of MPI’s API to orchestrate the execution of a large number of MPI processes on much fewer compute nodes, so that the running processes maximize the amount of computation that they perform with the data fetched from the disk. BDMPI enables the development of efficient out-of-core parallel distributed memory codes without the high engineering and algorithmic complexities associated with multiple levels of blocking. BDMPI achieves significantly better performance than existing technologies on a single node as well as on a small cluster, and performs within 30% of optimized out-of-core implementations.  相似文献   

10.
cJVM is a Java virtual machine (JVM) which provides a single system image of a traditional JVM while executing in a distributed fashion on the nodes of a cluster. cJVM virtualizes the cluster, transparently distributing the objects and threads of any pure Java application. The aim of cJVM is to obtain improved scalability for Java server applications by distributing the application's work among the cluster's computing resources. cJVM's architecture, its unique object model, thread, and memory models, were described by Y. Aridor et al. (1999, in “International Conference on Parallel Processing, September 21–24) and at http://www.haifa.il.ibm.com/projects/systech/cjvm.html. In this article, we focus on the optimization techniques employed in cJVM to achieve high scalability. In particular, we focus on the techniques used to enhance locality, thereby reducing the amount of communication generated by cJVM. Our optimization techniques are based on three principles. First, we employ a large number of mostly simple optimizations which address caching, locality of execution, and object migration. Second, we take advantage of the Java semantics and of common usage patterns in implementing the optimizations. Third, we use speculative optimizations, taking advantage of the fact that the cJVM run-time environment can correct false speculations. We have demonstrated the usefulness of these techniques on a large (10 Kloc), real, Java application where we demonstrate an 80% efficiency on a four-node cluster. This paper discusses the various techniques used and reports our results.  相似文献   

11.
This paper presents architecture independent characterization of embedded Java workloads based on the industry standard GrinderBench benchmark which includes different classes of real world embedded Java applications. This work is based on a custom built embedded Java virtual machine (JVM) simulator specifically designed for embedded JVM modeling and embodies domain specific details such as thread scheduling, algorithms used for native CLDC APIs and runtime data structures optimized for use in embedded systems. The results presented include dynamic execution characteristics, dynamic bytecode instruction mix, application and API workload distribution, object allocation statistics, instruction-set coverage, memory usage statistics and method code and stack frame characteristics.  相似文献   

12.
石学锋  陈智  李政道 《计算机工程》2007,33(17):273-274
为了增强MHP机顶盒的网络交互能力,必须构建Java运行环境。该文介绍了Java技术、MHP机顶盒软件结构模型、嵌入式Java虚拟机(JVM)概念以及开源Java虚拟机——Kaffe的软件层次结构,阐述了Kaffe在ALI公司数字电视机顶盒开发平台上移植的实现过程,提出了在嵌入式环境下,Java虚拟机执行引擎的性能优化策略。实际运行结果证明了JVM的移植性和性能优化策略的可行性。  相似文献   

13.
This paper describes JaRec, a portable record/replay system for Java. It correctly replays multi‐threaded, data‐race free Java applications, by recording the order of synchronization operations, and by executing them in the same order during replay. The record/replay infrastructure is developed in Java, and does not require a modification of the Java Virtual Machine (JVM) if it provides the JVM Profiler Interface (JVMPI). If the JVM does not support JVMPI, which is used for intercepting the loaded classes, only a minor modification to the JVM is required in order to run the system. On ystems with limited memory resources, JaRec can be executed in a distributed fashion. This also makes it suitable to aid debugging of multi‐threaded applications on embedded systems. Copyright © 2004 John Wiley & Sons, Ltd.  相似文献   

14.
The paper research is concerned with enabling parallel, high-performance computation—in particular development of scientific software in the network-aware programming language, Java. Traditionally, this kind of computing was done in Fortran. Arguably, Fortran is becoming a marginalized language, with limited economic incentive for vendors to produce modern development environments, optimizing compilers for new hardware, or other kinds of associated software expected of by today’s programmers. Hence, Java looks like a very promising alternative for the future. The paper will discuss in detail a particular environment called HPJava. HPJava is the environment for parallel programming—especially data-parallel scientific programming—in Java. Our HPJava is based around a small set of language extensions designed to support parallel computation with distributed arrays, plus a set of communication libraries. A high-level communication API, Adlib, is developed as an application level communication library suitable for our HPJava. This communication library supports collective operations on distributed arrays. We include Java Object as one of the Adlib communication data types. So we fully support communication of intrinsic Java types, including primitive types, and Java object types.  相似文献   

15.
In the 1990s the Message Passing Interface Forum defined MPI bindings for Fortran, C, and C++. With the success of MPI these relatively conservative languages have continued to dominate in the parallel computing community. There are compelling arguments in favour of more modern languages like Java. These include portability, better runtime error checking, modularity, and multi‐threading. But these arguments have not converted many HPC programmers, perhaps due to the scarcity of full‐scale scientific Java codes, and the lack of evidence for performance competitive with C or Fortran. This paper tries to redress this situation by porting two scientific applications to Java. Both of these applications are parallelized using our thread‐safe Java messaging system—MPJ Express. The first application is the Gadget‐2 code, which is a massively parallel structure formation code for cosmological simulations. The second application uses the finite‐domain time‐difference method for simulations in the area of computational electromagnetics. We evaluate and compare the performance of the Java and C versions of these two scientific applications, and demonstrate that the Java codes can achieve performance comparable with legacy applications written in conventional HPC languages. Copyright © 2009 John Wiley & Sons, Ltd.  相似文献   

16.
本文首先定义了Java芯片系统中的地址结构对象地址、段式地址和物理地址.针对现有地址转换方法的不足,提出了一种结合段式和段页式的混合转换方法.该方法使访问对象的速度、存储空间利用率和设备复杂度等方面得到合理的折衷.文中还给出了该方法在Java芯片系统中的硬件实现方法和相关的物理存储空间分配算法,并对其进行了性能评价.  相似文献   

17.
Since Java security relies on the type-safety of the JVM, many formal approaches have been taken in order to prove the soundness of the JVM. This paper presents a new formalization of the JVM and proves its soundness. It is the first model to employ dynamic linking and bytecode verification to analyze the loading constraint scheme of Java2. The key concept required for proving the soundness of the new model is augmented value typing, which is defined from ordinary value typing combined with the loading constraint scheme. In proving the soundness of the model, it is shown that there are some problems inside the current reference implementation of the JVM with respect to our model. We also analyze the findClass scheme, newly introduced in Java2. The same analysis also shows why applets cannot exploit the type-spoofing vulnerability reported by Saraswat, which led to the introduction of the loading constraint scheme.  相似文献   

18.
In this paper we describe a way to save and restore the state of a running Java program. We achieve this on the language level, without modifying the Java virtual machine, by instrumenting the programmer's original code with a preprocessor. The automatically inserted code saves the runtime information when the program requests state saving and re-establishes the program's runtime state on restart. The current preprocessor prototype is used in a mobile agent scenario to offer transparent agent migration for Java-based mobile agents, but could generally be used to save and re-establish the execution state of any Java program.  相似文献   

19.
Nowadays, Java is used in all types of embedded devices. For these memory-constrained systems, the automatic dynamic memory manager (Garbage Collector or GC) has been always a key factor in terms of the Java Virtual Machine (JVM) performance. Moreover, in current embedded platforms, power consumption is becoming as important as performance. Thus, in this paper we present an exploration, from an energy viewpoint, of the different possibilities of memory hierarchies for high-performance embedded systems when used by state-of-the-art GCs. This is a starting point for a better understanding of the interactions between the Java applications, the memory hierarchy and the GC.Hence, we subsequently present two techniques to reduce energy consumption on Java-based embedded systems, based on exploiting GC information. The first technique uses GC execution behavior to reduce leakage energy consumption taking advantage of the low-power mode of actual multi-banked SDRAM memories and it is intended for generational collectors. This technique can achieve a reduction up to 50% of SDRAM memory leakage.The second technique involves the inclusion of a software-controlled (scratch-pad) memory that stores GC instructions under the JVM control to reduce the active energy consumption and also improve the performance of the target embedded system and it is aimed at all kind of garbage collectors. For this last technique we have experimented with two different approaches for selecting the GC code to be stored in the scratchpad memory: one static and one dynamic. Our experimental results show that the proposed dynamic scratchpad management approach for GCs enables up to 63% energy consumption reduction and 25% performance improvement during the collector phase, which means, in terms of JVM execution, a global reduction of 29% and 17% for energy and cycles, respectively.Overall, this work outlines that the key for an efficient low-power implementation of Java Virtual Machines for high-performance embedded systems is the synergy between the GC choice, the memory architecture tuning, and the inclusion of power management schemes controlled by the JVM, exploiting knowledge of the GC behavior.  相似文献   

20.
In this work, a software ageing analysis of Java‐based software systems is conducted. The JVM is the core layer in Java‐based systems, and its dependability greatly affects the overall system quality. Starting from an experimental campaign on a real‐world test bed, this work isolates the contribution of the JVM to the overall ageing trend, and identifies, through statistical methods, which workload parameters are the most relevant to ageing dynamics. Results revealed the presence of several ageing dynamics in the JVM, including (i) a throughput loss trend mainly dependent on the execution unit, (ii) a slow memory depletion drift due to the just‐in‐time‐compiler activity and (iii) a fast memory depletion drift caused by dynamics inside the garbage collector. The outlined procedure and obtained results are useful in order to (i) identify the presence of ageing phenomena, (ii) perform online ageing detection and time‐to‐exhaustion prediction and (iii) define optimal rejuvenation techniques. Copyright © 2011 John Wiley & Sons, Ltd.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号