首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
We propose a methodology for designing sound and complete proof systems for proving progress properties of parallel programs under various fairness assumptions. Our methodology begins with a branching time temporal logic formula (CTL*) formula that expresses progress under a fairness assumption. The next step obtains an equivalent fixpoint characterization of this CTL* formula in the-calculus. The final step uses the fixpoint characterizations to extract proof systems for proving progress under the fairness constraint. The methodology guarantees that the proof rules so obtained are sound and relatively complete in the sense of Cook.  相似文献   

2.
This paper describes Parallel Proto (PProto), an integrated environment for constructing prototypes of parallel programs. Using functional and performance modeling of dataflow specifications, PProto assists in analysis of high-level software and hardware architectural tradeoffs. Facilities provided by PProto include a visual language and an editor for describing hierarchical dataflow graphs, a resource modeling tool for creating parallel architectures, mechanisms for mapping software components to hardware components, an interactive simulator for prototype interpretation, and a reuse capability. The simulator contains components for instrumenting, animating, debugging, and displaying results of functional and performance models. The Pproto environment is built on top of a substrate for managing user interfaces and database objects to provide consistent views of design objects across system tools.  相似文献   

3.
4.
Summary We consider the specification and verification of cyclic (sequential and concurrent) programs. The input-output based concept of correctness traditionally applied to functional programs is replaced by another, based on the concept of eventual behaviour. Various types of eventual behaviour are introduced. In the case of concurrency, the introduction of interface-predicates reduces the proof complexity and achieves greater readability. All specifications use explicitly the auxiliary variables of a location counter and elapsing time t.The research of N.F. was partially supported by The Fund for Aiding Research, Histadrut, The Federation of Hebrew workers in Israel  相似文献   

5.
Parallel execution is normally used to decrease the amount of time required to run a program. However, the parallel execution may require far more space than that required by the sequential execution. Worse yet, the parallel space requirement may be very much more difficult to predict than the sequential space requirement because there are more factors to consider. These include essentially nondeterministic factors that can influence scheduling, which in turn may dramatically influence space requirements. We survey some scheduling algorithms that attempt to place bounds on the amount of time and space used during parallel execution. We also outline a direction for future research. This direction takes us into the area of functional programming, where the declarative nature of the languages can help the programmer to produce correct parallel programs, a feat that can be difficult with procedural languages. Currently the high-level nature of functional languages can make it difficult for the programmer to understand the operational behavior of the program. We look at some of the problems in this area, with the goal of achieving a programming environment that supports correct, efficient parallel programs.  相似文献   

6.
A subset of ADA is introduced, ADA-CF, to study the basic synchronization and communication primitive of ADA, the rendezvous. Basing ourselves on the techniques introduced by Apt, Francez and de Roever for their CSP proof system, we develop a Hoare-style proof system for proving partial correctness properties which is sound and relatively complete. The proof system is then extended to deal with safety, deadlock, termination and failure. No prior exposure of the reader to parallel program proving techniques is presupposed. Two non-trivial example proofs are given of ADA-CF programs; the first one concerns a buffered producer-consumer algorithm, the second one a parallel sorting algorithm due to Brinch Hansen. Features of ADA expressing dynamic process creation and realtime constraints are not covered by our proof methods. Consequently, we do not claim that the methods described can be extended to full ADA without serious additional further research.  相似文献   

7.
The main issues when supporting fault tolerance based on checkpointing and rollback recovery for High‐Performance applications are related to the scalability of the introduced support, the possibility of analyzing the induced overhead and, in more general terms, the optimization of the trade‐off between failure‐free and recovery performances. In this paper we describe our contribution in fault tolerance for high‐level structured parallelism models. We take a different viewpoint w.r.t. existing contributions, by introducing a methodology to derive interesting properties to support fault tolerance. We show how to apply this methodology to a general data parallel model, deriving useful properties to introduce a class of checkpointing protocols. Thanks to this methodology, this class of protocols is not affected by the described issues. We exemplify two checkpointing protocols and the related rollback recovery techniques. For each protocol we also derive cost models statically describing the failure‐free performance, which can be used for performance tuning or to target some Quality of Service parameter. To assess the innovation of the results we analytically and experimentally compare the introduced protocols with two literature protocols. Results show that while the protocols introduced in this paper permit the definition of cost models and have a good scalability, the literature protocols do not always have these properties. Copyright © 2010 John Wiley & Sons, Ltd.  相似文献   

8.
9.
Curare, the program restructurer described in this paper automatically transforms a sequential Lisp program into an equivalent concurrent program that runs on a multiprocessor.Data dependences constrain the program's concurrent execution because, in general, two conflicting statements cannot execute in a different order without affecting the program's result. Not all dependences are essential to produce the program's result.Curare attempts to transform the program so it computes its result with fewer conflicts. An optimized program will execute with less synchronization and more concurrency. Curare then examines loops in a program to find those that are unconstrained or lightly constrained by dependences. By necessity,Curare treats recursive functions as loops and does not limit itself to explicit program loops. Recursive functions offer several advantages over explicit loops since they provide a convenient framework for inserting locks and handling the dynamic behavior of symbolic programs. Loops that are suitable for concurrent execution are changed to execute on a set of concurrent server processes. These servers execute single loop iterations and therefore need to be extremely inexpensive to invoke.Restructured programs execute significantly faster than the original sequential programs. This improvement is large enough to attract programmers to a multiprocessor, particularly since it requires little effort on their part.This research was funded by DARPA contract numbers N00039-85-C-0269 (SPUR) and N00039-84-C-0089 (XCS) and by an NSF Presidential Young Investigator award to Paul N. Hilfinger. Additional funding came from the California MICRO program (in conjunction with Texas Instruments, Xerox, Honeywell, and Phillips/Signetics).  相似文献   

10.
Parallel simulation of parallel programs for large datasets has been shown to offer significant reduction in the execution time of many discrete event models. The paper describes the design and implementation of MPI-SIM, a library for the execution driven parallel simulation of task and data parallel programs. MPI-SIM can be used to predict the performance of existing programs written using MPI for message passing, or written in UC, a data parallel language, compiled to use message passing. The simulation models can be executed sequentially or in parallel. Parallel execution of the models are synchronized using a set of asynchronous conservative protocols. The paper demonstrates how protocol performance is improved by the use of application-level, runtime analysis. The analysis targets the communication patterns of the application. We show the application-level analysis for message passing and data parallel languages. We present the validation and performance results for the simulator for a set of applications that include the NAS Parallel Benchmark suite. The application-level optimization described in the paper yielded significant performance improvements in the simulation of parallel programs, and in some cases completely eliminated the synchronizations in the parallel execution of the simulation model  相似文献   

11.
Summary This paper is about the Floyd-Hoare Principle which says that the semantics of a programming language can be formally specified by axioms and rules of inference for proving the correctness of programs written in the language. We study the simple language WP of while-programs and Hoare's system for partial correctness and we calculate the relational semantics of WP as this is determined by Hoare's logic. This calculation is possible by using relational semantics to build a completeness theorem for the logic. The resulting semantics AX we call the axiomatic relational semantics for WP. This AX is not the conventional semantics for WP: it need not be effectively computable or deterministic, for example. A large number of elegant properties of AX are proved and the Floyd-Hoare Principle is reconsidered.  相似文献   

12.
Estimating and optimizing performance for parallel programs   总被引:1,自引:0,他引:1  
Fahringer  T. 《Computer》1995,28(11):47-56
The article describes P3T, a parameter-based performance prediction tool that estimates performance for parallel programs running on distributed-memory parallel architectures. P3 T has been carefully designed to address all of the above performance estimation issues. To achieve high estimation accuracy, P 3T aggressively exploits compiler analysis and optimization information. Our method is based on modeling loop iteration spaces, array access patterns, and data distributions by intersection and volume operations on n-dimensional polytopes. The most critical architecture-specific factors, such as cache line sizes, number of cache lines available, routing policy, start-up times, message transfer time per byte, and so forth, are modeled to reflect the performance impact of the target machine. P3T has been developed in the context of the Vienna Fortran Compilation Systems (VFCS), a state-of-the-art parallelization tool for distributed-memory systems. VFCS translates Fortran programs into explicitly parallel message-passing programs. P 3T successfully guides the interactive and automatic restructuring of programs under this system. The article describes the underlying compilation and programming model and discusses the most critical design decisions made for P3T; in addition, it outlines the implementation of the parallel program parameters. Also described are the VFCS context under which P3T is applied and the P3T graphical user interface  相似文献   

13.
The amount of memory required by a parallel program may be spectacularly larger than the memory required by an equivalent sequential program, particularly for programs that use recursion extensively. Since most parallel programs are nondeterministic in behavior, even when computing a deterministic result, parallel memory requirements may vary from run to run, even with the same data. Hence, parallel memory requirements may be both large (relative to memory requirements of an equivalent sequential program) and unpredictable. Assume that each parallel program has an underlying sequential execution order that may be used as a basis for predicting parallel memory requirements. We propose a simple restriction that is sufficient to ensure that any program that will run in n units of memory sequentially can run in mn units of memory on m processors, using a scheduling algorithm that is always within a factor of two of being optimal with respect to time. Any program can be transformed into one that satisfies the restriction, but some potential parallelism may be lost in the transformation. Alternatively, it is possible to define a parallel programming language in which only programs satisfying the restriction can be written  相似文献   

14.
Low-latency, concurrent checkpointing for parallel programs   总被引:2,自引:0,他引:2  
Presents the results of an implementation of several algorithms for checkpointing and restarting parallel programs on shared-memory multiprocessors. The algorithms are compared according to the metrics of overall checkpointing time, overhead imposed by the checkpointer on the target program, and amount of time during which the checkpointer interrupts the target program. The best algorithm measured achieves its efficiency through a variation of copy-on-write, which allows the most time-consuming operations of the checkpoint to be overlapped with the running of the program being checkpointed  相似文献   

15.
《国际计算机数学杂志》2012,89(3-4):201-212
This paper is the second of a two-part series exploring the subtle correctness criterion of the absence of livelocks in parallel programs. In this paper we are concerned with the issue of proving this correctness criterion. It is shown that livelocks are not preserved by reduction, implying that reduction cannot be used directly in proving the absence of livelocks. Two applicable proof techniques are also presented. One is based on the notion of establishing sufficient conditions for livelock-freedom; the other is an extension of the well-founded set method for proving termination in sequential programs.  相似文献   

16.
We propose an axiomatic semantics for the synchronous language Gentzen, which is an instantiation of the paradigm Timed Concurrent Constraint Programming proposed by Saraswat, Jagadeesan and Gupta. We view Gentzen as a prototype of the class of state-oriented synchronous languages, since it offers the basic constructs that are shared by the languages in the class. Since synchronous concurrency cannot be simulated by arbitrary interleaving, we cannot exploit “head normal forms”, on which axiomatic theories for asynchronous process calculi are based. We suggest how axiomatic semantics for other state-oriented synchronous languages can be obtained by expressing constructs of such languages in terms of Gentzen constructs.  相似文献   

17.
Fumio Negoro 《Knowledge》2003,16(7-8):383-397
The purpose of our study is to build up relationships between requirement and source programs with our originally thought-out rules. When other rules to be derived from these original rules are applied to software development, even a single instruction in a programming language could be determined, and the program would satisfy the requirement. More specifically speaking, these rules will turn into a formula or a prototype of software programs. Hence, when the variables in the requirement are placed in the formula, we can get a required program in an automatic way.  相似文献   

18.
A simple proof method is presented for proving invariance properties of concurrent programs in priority-scheduled systems. This method is illustrated by using it to establish the correctness of a simple wait-free consensus algorithm for priority-scheduled uniprocessor systems. This consensus algorithm is of interest in its own right because is shows that atomic read and write operations are universal in priority-scheduled uniprocessor systems, i.e., they can be used to implement any shared object in such a system in a wait-free manner. This stands in contrast to fully asynchronous systems, where strong synchronization primitives such as compare-and-swap are needed for universality.  相似文献   

19.
Summary Proof rules are presented for an extension of Hoare's Communicating Sequential Processes. The rules deal with total correctness; all programs terminate in the absence of deadlock. The commands send and receive are treated symmetrically, simplifying the rules and allowing send to appear in guards. Also given are sufficient conditions for showing that a program is deadlock-free. An extended example illustrates the use of the technique.This research was supported in part by the National Science Foundation under grant MCS-7622360.  相似文献   

20.
Singh  J.P. Hennessy  J.L. Gupta  A. 《Computer》1993,26(7):42-50
Models for the constraints under which an application should be scaled, including constant problem-size scaling, memory-constrained scaling, and time-constrained scaling, are reviewed. A realistic method is described that scales all relevant parameters under considerations imposed by the application domain. This method leads to different conclusions about the effectiveness and design of large multiprocessors than the naive practice of scaling only the data set size. The primary example application is a simulation of galaxies using the Barnes-Hut hierarchical N-body method  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号