Similar literature
 20 similar documents found (search time: 62 ms)
1.
Large-scale data visualization using parallel data streaming   (cited 2 times in total: 0 self-citations, 2 citations by others)
We present an architectural approach based on parallel data streaming to enable visualizations on a parallel cluster. Our approach requires less memory than other visualizations while achieving high code reuse. We implemented our architecture within the Visualization Toolkit (VTK). It includes specific additions to support the Message Passing Interface (MPI); memory limit-based streaming of both implicit and explicit topologies; translation of streaming requests between topologies; and passing data and pipeline control between shared, distributed, and mixed memory configurations. The architecture directly supports both sort-first and sort-last parallel rendering.

2.
Debuggers play an important role in developing parallel applications. They are used to control the state of many processes, to present distributed information in a concise and clear way, to observe the execution behavior, and to detect and locate programming errors. More sophisticated debugging systems also try to improve understanding of global execution behavior and intricate details of a program. In this paper we describe the design and implementation of SPiDER, which is an interactive source-level debugging system for both regular and irregular High-Performance Fortran (HPF) programs. SPiDER combines a base debugging system for message-passing programs with a high-level debugger that interfaces with an HPF compiler. SPiDER, in addition to conventional debugging functionality, allows a single process of a parallel program to be inspected or the entire program to be examined from a global point of view. A sophisticated visualization system has been developed and included in SPiDER to visualize data distributions, data-to-processor mapping relationships, and array values. SPiDER enables a programmer to dynamically change data distributions as well as array values. For arrays whose distribution can change during program execution, an animated replay displays the distribution sequence together with the associated source code location. Array values can be stored at individual execution points and compared against each other to examine execution behavior (e.g. convergence behavior of a numerical algorithm). Finally, SPiDER also offers limited support to evaluate the performance of parallel programs through a graphical load diagram. SPiDER has been fully implemented and is currently being used for the development of various real-world applications. Several experiments are presented that demonstrate the usefulness of SPiDER. Copyright © 2002 John Wiley & Sons, Ltd.

3.
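In message-passing systems for distributed-memory parallel computing, the message-passing paths used in many broadcast communications are transparent to the programmer, who cannot change them, even though the run-time conditions of an application are complex. Letting the programmer choose message-passing paths according to the computing environment and the characteristics of the application helps improve the efficiency of broadcast communication. During communication, message tags are used to distinguish messages so that the receiving process can receive the correct message. However, message tags easily lead to application errors and add to the complexity of writing programs. The paper first gives a formal definition of logical topologies and their basic properties, …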

4.
This article focuses on the effect of both process topology and load balancing on various programming models for SMP clusters and iterative algorithms. More specifically, we consider nested loop algorithms with constant flow dependencies that can be parallelized on SMP clusters with the aid of the tiling transformation. We investigate three parallel programming models, namely a popular message passing monolithic parallel implementation, as well as two hybrid ones that employ both message passing and multi-threading. We conclude that the selection of an appropriate mapping topology for the mesh of processes has a significant effect on the overall performance, and provide an algorithm for the specification of such an efficient topology according to the iteration space and data dependencies of the algorithm. We also propose static load balancing techniques for the computation distribution between threads that diminish the disadvantage of the master thread assuming all inter-process communication due to limitations often imposed by the message passing library. Both improvements are implemented as compile-time optimizations and are further experimentally evaluated. An overall comparison of the above parallel programming styles on SMP clusters, based on micro-kernel experimental evaluation, is also provided.

5.
Interaction is critical to effective visualization, but can be difficult to author and debug due to dependencies among input events, program state, and visual output. Recent advances leverage reactive semantics to support declarative design and avoid the “spaghetti code” of imperative event handlers. While reactive programming improves many aspects of development, textual specifications still fail to convey the complex runtime dynamics. In response, we contribute a set of visual debugging techniques to reveal the runtime behavior of reactive visualizations. A timeline view records input events and dynamic variable updates, allowing designers to replay and inspect the propagation of values step-by-step. On-demand annotations overlay the output visualization to expose relevant state and scale mappings in-situ. Dynamic tables visualize how backing datasets change over time. To evaluate the effectiveness of these techniques, we study how first-time Vega users debug interactions in faulty, unfamiliar specifications; with no prior knowledge, participants were able to accurately trace errors through the specification.

6.
In this paper, without assuming balanced network topologies, we address the weighted average consensus problem for discrete-time single-integrator multi-agent systems with logarithmically quantized information communication. By combining a generalized quadratic Lyapunov function with the discrete-time Bellman–Gronwall inequality, a new upper bound on the quantization precision parameter of the infinite-level logarithmic quantizer is derived for designing a quantized protocol, under which agents in strongly connected directed networks can attain weighted average consensus. The new upper bound clearly characterizes the intimate relation between the quantization precision parameter and the directed network topology. The proposed quantized protocol is particularly applicable to digital networks where balanced message passing among agents is not available.
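To make the notion of an infinite-level logarithmic quantizer concrete, the following is a minimal sketch of one common textbook definition, not necessarily the exact form used in the paper. The function name `log_quantize`, the density `rho`, and the base level `u0` are illustrative assumptions. With levels ±rho^i·u0 and delta = (1 - rho)/(1 + rho), the relative quantization error stays within delta.

```python
import math

def log_quantize(v, rho=0.5, u0=1.0):
    """Map v onto the level set {0} U {+/- rho**i * u0 : i any integer}.

    rho (0 < rho < 1) and u0 > 0 are assumed illustrative parameters;
    delta = (1 - rho) / (1 + rho) is the resulting sector bound.
    """
    if v == 0:
        return 0.0
    delta = (1 - rho) / (1 + rho)
    mag = abs(v)
    # Pick i such that rho**i*u0/(1+delta) < mag <= rho**i*u0/(1-delta).
    i = math.floor(math.log(mag * (1 - delta) / u0) / math.log(rho))
    return math.copysign(rho ** i * u0, v)

if __name__ == "__main__":
    for v in (0.3, 1.0, 1.4, -2.7):
        print(v, "->", log_quantize(v))
```

A coarser quantizer (smaller `rho`) gives a larger `delta`, which is the kind of precision parameter the abstract's upper bound constrains.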

7.
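Research on parallel software development environments has increasingly become a focus of parallel computing and parallel processing. This paper briefly introduces the portable message-passing environment PVM, discusses XPVM, a graphical monitoring environment built for it, explains the gap between the XPVM environment and the PVM parallel debugging environment that is actually needed, and on that basis explores the technical difficulties and design requirements in developing a parallel debugging environment.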

8.
Developing high-quality, error-free message-passing concurrent programs is not trivial. Although a number of different primitives with associated semantics are available to assist such development, they often increase the complexity of the testing process. In this paper, we extend our previous test model for message-passing programs and present new structural testing criteria, taking into account additional features used in this paradigm, such as collective communication, non-blocking sends, distinct semantics for non-blocking receives, and persistent operations. Our new model also recognizes that sender primitives cannot always be matched with every receive primitive. This improvement allows us to remove statically a significant number of infeasible synchronization edges that would otherwise have to be analyzed later by the tester. In this paper, the test model is presented using the Message-Passing Interface standard; however, our new model has been designed to be flexible, and it can be configured to support a range of different message-passing environments or languages. We have carried out case studies showing the applicability of the new test model to represent message-passing programs and also to reveal errors, mainly those errors related to inter-process communication. In addition to increasing the number of features supported by the test model, we have also reduced the overall cost of testing significantly. Our case studies suggest that the number of synchronization edges can be reduced by up to 93%, mainly by eliminating infeasible edges between unmatchable communication primitives. The main contribution of the paper is to present a more flexible test model that provides improved coverage for message-passing programs and at the same time reduces the cost of testing significantly. Copyright © 2012 John Wiley & Sons, Ltd.

9.
Creating a formal specification for a design is an error-prone process. At the same time, debugging incorrect specifications is difficult and time consuming. In this work, we propose a debugging method for formal specifications that does not require an implementation. We handle conflicts between a formal specification and the informal design intent using a simulation-based refinement loop, where we reduce the problem of debugging overconstrained specifications to that of debugging unrealizability. We show how model-based diagnosis can be applied to locate an error in an unrealizable specification. The diagnosis algorithm computes properties and signals that can be modified in such a way that the specification becomes realizable, thus pointing out potential error locations. In order to fix the specification, the user must understand the problem. We use counterstrategies to explain conflicts in the specification. Since counterstrategies may be large, we propose several ways to simplify them. First, we compute the counterstrategy not for the original specification but only for an unrealizable core. Second, we use a heuristic to search for a countertrace, i.e., a single input trace which necessarily leads to a specification violation. Finally, we present the countertrace or the counterstrategy as an interactive game against the user, and as a graph summarizing possible plays of this game. We introduce a user-friendly implementation of our debugging method and present experimental results for GR(1) specifications.

10.
Since its release, the Java programming language has attracted considerable attention from the high-performance computing (HPC) community because of its portability, high programming productivity, and built-in multithreading and networking support. As a consequence, several initiatives have been taken to develop a high-performance Java message-passing library to program distributed memory architectures, such as clusters. The performance of Java message-passing applications relies heavily on the communications performance. Thus, the design and implementation of low-level communication devices that support message-passing libraries is an important research issue in Java for HPC. MPJ Express is our Java message-passing implementation for developing high-performance parallel Java applications. Its public release currently contains three communication devices: the first one is built using the Java New Input/Output (NIO) package for TCP/IP; the second one is specifically designed for the Myrinet Express library on Myrinet; and the third one supports thread-based shared memory communications. Although these devices have been successfully deployed in many production environments, previous performance evaluations of MPJ Express suggest that the buffering layer, tightly coupled with these devices, incurs a certain degree of copying overhead, which represents one of the main performance penalties. This paper presents a more efficient Java message-passing communications device, based on Java Input/Output sockets, that avoids this buffering overhead. Moreover, this device implements several strategies, both in the communication protocol and in the HPC hardware support, which optimize Java message-passing communications. In order to evaluate its benefits, this paper analyzes the performance of this device comparatively with other Java and native message-passing libraries on various high-speed networks, such as Gigabit Ethernet, Scalable Coherent Interface, Myrinet, and InfiniBand, as well as on a shared memory multicore scenario. The reported communication overhead reduction encourages the upcoming incorporation of this device in MPJ Express (http://mpj-express.org). Copyright © 2011 John Wiley & Sons, Ltd.

11.
Debuggers are an integral, albeit often neglected, part of the development of distributed applications. Ambient-oriented programming (AmOP) is a distributed paradigm for applications running on mobile ad hoc networks. In AmOP the complexity of programming in a distributed setting is married with the network fragility and open topology of mobile applications. To our knowledge, there is no debugging approach that tackles both these issues. In this paper we argue that a novel kind of distributed debugger, which we term an ambient-oriented debugger, is required. We present REME-D (read as remedy), an online ambient-oriented debugger that integrates techniques from distributed debugging (event-based debugging, message breakpoints) and proposes facilities to deal with ad hoc, fragile networks: epidemic debugging and support for frequent disconnections.

12.
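The Jacobi iteration is the most commonly used method for solving systems of linear equations and has a wide range of applications. Jacobi iteration is compute-intensive [1], so applying parallel computing techniques to it is of considerable significance. The Jacobi iteration is parallelized using the vector datatypes and virtual process topologies provided by the message-passing programming model MPI.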
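As a rough illustration of how a Jacobi sweep decomposes across processes, the sketch below splits the rows of the system into contiguous blocks and updates each block independently from the previous iterate. It is a serial simulation under stated assumptions: the names `jacobi_rows`, `jacobi`, and `n_ranks` are hypothetical, and the MPI vector datatypes and virtual process topology mentioned in the abstract are not reproduced here; the concatenation step only stands in for the exchange that an MPI program would perform.

```python
def jacobi_rows(A, b, x, rows):
    """Jacobi update for the rows owned by one (simulated) rank."""
    n = len(b)
    return [
        (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        for i in rows
    ]

def jacobi(A, b, n_ranks=2, tol=1e-10, max_iter=1000):
    """Solve A x = b by Jacobi iteration with a row-block decomposition."""
    n = len(b)
    # Contiguous row blocks, one per simulated rank.
    blocks = [range(r * n // n_ranks, (r + 1) * n // n_ranks)
              for r in range(n_ranks)]
    x = [0.0] * n
    for _ in range(max_iter):
        # Each block depends only on the previous iterate x, so the blocks
        # could be computed in parallel; joining them models the exchange.
        x_new = [v for rows in blocks for v in jacobi_rows(A, b, x, rows)]
        if max(abs(p - q) for p, q in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x

if __name__ == "__main__":
    # Diagonally dominant system whose exact solution is x = [1, 2].
    print(jacobi([[4.0, 1.0], [1.0, 3.0]], [6.0, 7.0]))
```

Because every row update reads only the previous iterate, no synchronization is needed inside a sweep, which is what makes the method a natural fit for message passing.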

13.
A common debugging strategy involves reexecuting a program (on a given input) over and over, each time gaining more information about bugs. Such techniques can fail on message-passing parallel programs. Because of nondeterminacy, different runs on the given input may produce different results. This nonrepeatability is a serious debugging problem, since an execution cannot always be reproduced to track down bugs. This paper presents a technique for tracing and replaying message-passing programs. By tracing the order in which messages are delivered, a reexecution can be forced to deliver messages in their original order, reproducing the original execution. To reduce the overhead of such a scheme, we show that the delivery order of only messages involved in races need be traced (and not every message). Our technique makes run-time decisions to detect and trace racing messages and is usually optimal in the sense that the minimal number of racing messages is traced. Experiments indicate that only 1% of the messages are often traced, gaining a reduction of two orders of magnitude over traditional techniques that trace every message. These traces allow an execution to be reproduced any number of times for debugging. Our work is novel in that we adaptively decide what to trace, and trace only those messages that introduce nondeterminacy. With our strategy, large reductions in trace size allow long-running programs to be replayed that were previously unmanageable. In addition, the reduced tracing requirements alleviate tracing bottlenecks, allowing executions to be debugged with substantially lower execution time overhead. This work was supported in part by National Science Foundation grants CCR-8815928 and CCR-9100968, Office of Naval Research grant N00014-89-J-1222, and a grant from Sequent Computer Systems, Inc.
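The basic record-and-replay idea can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it traces the sender of every delivered message (the traditional baseline), whereas the paper traces only racing messages. The names `record_trace` and `ReplayReceiver` are hypothetical.

```python
from collections import deque

def record_trace(arrivals):
    """Record the sender of each message in the order it was delivered."""
    return [sender for sender, _payload in arrivals]

class ReplayReceiver:
    """Buffers incoming messages and delivers them in the traced order."""

    def __init__(self, trace):
        self.trace = deque(trace)   # sender ids in original delivery order
        self.pending = {}           # sender id -> queue of buffered payloads

    def arrive(self, sender, payload):
        """Accept a message; return the messages that can now be delivered."""
        self.pending.setdefault(sender, deque()).append(payload)
        delivered = []
        # Deliver only while the next traced sender has a buffered message.
        while self.trace and self.pending.get(self.trace[0]):
            s = self.trace.popleft()
            delivered.append((s, self.pending[s].popleft()))
        return delivered

if __name__ == "__main__":
    original = [("A", 1), ("B", 2), ("A", 3)]
    trace = record_trace(original)
    rx = ReplayReceiver(trace)
    replayed = []
    # The same messages arrive in a different order during reexecution.
    for sender, payload in [("B", 2), ("A", 1), ("A", 3)]:
        replayed.extend(rx.arrive(sender, payload))
    print(replayed == original)
```

Buffering an early message until its turn in the trace is what forces the reexecution to reproduce the original delivery order.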

14.
A methodology for the design and development of data-parallel applications and components is presented. Data-parallelism is a well understood form of parallel computation, yet developing simple applications can involve substantial efforts to express the problem in low level notations. We describe a process of software development for data-parallel applications starting from high level specifications, generating repeated refinements of designs to match different architectural models and performance constraints, enabling a development activity with cost-benefit analysis. Primary issues are algorithm choice, correctness, and efficiency, followed by data decomposition, load balancing, and message passing coordination. Development of a data-parallel multitarget tracking application is used as a case study, showing the progression from high to low level refinements. We conclude by describing tool support for the process.

15.
This paper describes Parallel Proto (PProto), an integrated environment for constructing prototypes of parallel programs. Using functional and performance modeling of dataflow specifications, PProto assists in analysis of high-level software and hardware architectural tradeoffs. Facilities provided by PProto include a visual language and an editor for describing hierarchical dataflow graphs, a resource modeling tool for creating parallel architectures, mechanisms for mapping software components to hardware components, an interactive simulator for prototype interpretation, and a reuse capability. The simulator contains components for instrumenting, animating, debugging, and displaying results of functional and performance models. The PProto environment is built on top of a substrate for managing user interfaces and database objects to provide consistent views of design objects across system tools.

16.
The Hyper-Ring (HR) is presented as a hierarchical and scalable ring-based topology for small-scale to massively parallel systems which eliminates the major disadvantages of large-scale rings. With a fixed node degree, a low cost, symmetric properties, and a simple routing scheme, the HR topology is very suitable for small-scale to large-scale multicomputer systems. Assuming pipelined communication, the performance of 4- and 5-dimensional HR multicomputers is modeled, the performance model is evaluated, and the results of the performance model evaluation are analyzed. Moreover, the impact of the traffic load and message length on the system performance is analyzed. The major objective of this work is to shed light on how to cluster HRs in order to optimize the system efficiency. Assuming a uniform message arrival rate into the nodes of the HR, the results show that the efficiency of HR topologies with an equal number of nodes is best when the topologies are perfectly balanced. The next best-performing HRs are those with larger rings at the lower (outer) levels and smaller rings at the higher levels (near the root ring). The results confirm that the HR topology is suitable for massively parallel and scalable multicomputer systems as well as for networks of workstations.

17.
Our focus is on the novel use of a process-oriented methodology in software systems for parallel simulation on distributed memory. To the best of our knowledge, the few existing systems which adopt a process view strictly use message passing to effect process interaction in distributed-memory settings. As a result, these systems avoid scenarios in which processes directly access passive but shared components. This can restrict the manner in which a system is modeled and hinder the phase of distributed model construction. In this paper, we propose an approach which utilizes mobile processes in distributed-memory simulation systems. The approach entails the migration of a requesting process with its timestamp to the remote site hosting the requested passive object. Major advantages of this approach include one-time transmission, fixed communication topology, and increased locality of reference. Empirical results based on lightweight processes show that the mobile process paradigm can be as efficient as the message-passing paradigm.

18.
Parallel programming is orders of magnitude more complex than writing sequential programs. This is particularly true for programming distributed memory multiprocessor architectures based on message passing programming models. Apart from understanding the sequential parts of the parallel program, new degrees of freedom lead to additional problems. Understanding the synchronization and communication behavior of parallel programs is the most critical issue in programming distributed memory multiprocessors. The paper describes methods and tools for visualization and animation of the dynamic execution of parallel programs. Based on an evaluation and classification of existing visualization environments, the visualization and animation tool VISTOP (VISualization TOol for Parallel Systems) is presented as part of the integrated tool environment TOPSYS (TOols for Parallel SYStems) for programming distributed memory multiprocessors. VISTOP supports the interactive on-line visualization of message passing programs based on various views; in particular, a process-graph-based concurrency view for detecting synchronization and communication bugs.

19.
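XNETi is a PCI-bus-based network interface designed for the XNET network interconnection system; it can effectively support user-level message passing. This paper focuses on the concrete implementation of functions in XNETi such as error control and packet segmentation/reassembly.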

20.
This paper presents the Distributed InterProcess Communication System (DIPCS) as a framework for managing communication in a distributed multimedia system. Within DIPCS, connection level management is provided through a novel distributed process group model called ADP-Group communication. The ADP-Group paradigm defines a new type of group message passing, called qos-reliable. Qos-reliable semantics are appropriate to controlling real-time multimedia communication, by allowing a spectrum of performance and reliability specifications to co-exist within one group. DIPCS also provides an abstract programming model of multimedia devices, easing control of a heterogeneous multimedia system. Distributed multimedia applications can be rapidly developed using simple group operation primitives. We show how ADP-Group message delivery semantics can be directly mapped into an efficient Integrated Services Network support policy. Supported by Computational Diagnostics, Inc., and the Benjamin Franklin Fund for the Commonwealth of Pennsylvania. Supported by the Rome Laboratory (RL) of the Air Force Material Command and the Defense Advanced Research Projects Agency (Contract F30602-93-C-0038).
