首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
任爱华  杜悦冬 《软件学报》2001,12(7):1064-1073
多处理机环境下的实时系统具有并发事件驱动性质,其软件结构展现了多重同步点以及生产者与消费者之间的关系,这导致了复杂的控制结构.对于此类系统软件的开发缺少标准的方法和工具,造成了软件低效、程序结构不清晰、开发成本高、维护困难的现象的出现.根据Petri网易于描述并行/并发现象的特点,采用它来解决多处理机软件的描述问题,介绍了一种以Petri网图形方式在多处理机系统环境下进行程序设计的方法.该方法基于两种程序设计级别:任务级和作业级.前者负责描述基本操作,由单一控制线程完成;后者用于并行/并发程序建模,由整个多处理机系统来执行.在作业级程序设计中,用户采用面向对象Petri网来描述并行程序结构,以建立系统模型.该方法以一种接近于程序员的思维方式去设计并发软件,提供了一种可靠的并行结构的程序.阐述了支持此种程序设计方法的操作系统结构及其实现原理.  相似文献   

2.
周杰  李文敬 《计算机科学》2017,44(Z11):586-591, 595
为解决多核机群Petri网并行化过程中,运用MPI+OPenMP混合编程实现同步会出现死锁的问题,提出了基于三层混合编程模型的Petri网并行算法。首先,根据事务内存的同步优势,在多核机群环境下构建MPI+OPenMP+STM的三层编程模型;然后,对Petri网的几何模型与代数模型的并行化进行分析,建立MPI+OPenMP+STM三层结构的Petri网并行模型,并对三层混合编程模型的Petri网并行算法进行设计与分析;最后,通过示例进行编程验证,该算法的运行效率明显优于其他编程模式,而且Petri网的规模越大,其并行计算的效果就越明显。因此,该算法是多核机群环境下模拟Petri网并行运行的一种高效且可行的算法。  相似文献   

3.
《Parallel Computing》1997,22(13):1837-1851
The PAPS (Performance Analysis of Parallel Systems) toolset is a testbed for the model based performance prediction of message passing parallel applications executed on private memory multiprocessor computer systems. PAPS allows to describe the execution behavior of the computer hardware and operating system software resources up to a very detailed level. This enables very accurate performance prediction of parallel applications even in the case of substantial performance degradation due to contention for shared resources. In this paper the fundamental design principles and implementation methodologies for the development of the PAPS toolset are presented and the PAPS parallel system specification formalisms are described. A simplified performance study of a parallel Gaussian elimination application on the nCUBE 2 multiprocessor system is used to demonstrate the usage of the tool.  相似文献   

4.
A quantitative performance study of two-phase locking in a parallel database machine using a simulation-based methodology is described. The DBSIM simulation methodology uses models at two levels: a Petri net model at the higher level and a queuing network model at the lower level. The Petri net model captures the characteristics of parallelism and synchronization at the workload level, while the queuing network model captures queuing aspects of the system at the physical resource level. Transactions in a workload are specified using a performance-oriented specification language based on the transaction component graph, a data flow graph with database operators. The transaction specifications are translated into Petri net representations to derive the simulation experiments. The workload is a transaction taken from an order-entry application. A shared-nothing parallel machine architecture is assumed. Results of analysis of a two-phase locking strategy with machine sizes ranging from 4 to 256 processors are presented  相似文献   

5.
This paper presents the specification and implementation procedure using a microcomputer network based autonomous distributed control architecture for industrial multirobot systems. The procedure is based on the concept of data flow network controlled by communicating sequential processes to perform coordinated tasks. Robots and other computerized industrial devices such as conveyors and manufacturing machines are defined as object-oriented Petri nets. A modular and hierarchical approach is adopted to define a set of Petri net type diagrams which represent concurrent activities of control processes for such devices. Asynchronous and synchronous interactions are modelled by places and transitions, respectively, in global process interaction nets. The control software is implemented on a computer network using Inmos transputers with true parallel processing and message passing primitives efficiently handled in hardware. Petri net based models are directly and efficiently transformed to corresponding codes in occam, the high level parallel programming language defined for the transputer.  相似文献   

6.
There is undoubtedly a need for software-design tools for parallel programming. A main problem with design tools for parallel programming is their inability to check for liveness (no deadlock) and safeness. In this paper, the use of Ordinary Petri net as a software design tool for Occam Petri Net and Occam constructs are discussed. The similarities between Ordinary Petri Net and Occam constructs are highlighted, and an Occam Petri Net model is proposed as a design tool to aid in writing Occam codes. The Occam Petri Net model is graphical. It is capable of modelling deterministic concurrent and choice systems. As a top-down design, the net is similar to Occam ‘folds’, and, in its use in bottom-up implementation, it is similar to unfolding. This unfolding using the Occam Petri Net model makes writing Occam source codes easier. The availability of Petri Net CASE tools will make it more attractive for designing Occam programs.  相似文献   

7.
8.
并行程序Petri网模型的结构性质   总被引:1,自引:0,他引:1  
正确性是并行程序的基础,但是由于它的复杂性,其验证要比串行程序困难得多,因此有必要进行建模并研究其性质.从程序的角度出发,在将基于消息传递的并行程序转换为Petri网模型之后,证明了与并行正确的并行程序对应的Petri网模型应当满足的结构性质,包括强连通性、S-不变量、T-不变量、受控死锁性质以及守恒性,并举例说明了这些性质在并行程序验证中的应用.这些性质可用于并行程序的事前验证,而且避免了使用动态性质进行验证时的状态爆炸问题,从而提高并行程序设计和验证效率.同时这些方法具有良好的可推广性.  相似文献   

9.
Algorithms from scientific computing often exhibit a two-level parallelism based on potential method parallelism and potential system parallelism. We consider the parallel implementation of those algorithms on distributed memory machines. The two-level potential parallelism of algorithms is expressed in a specification consisting of an upper level hierarchy of multiprocessor tasks each of which has an internal structure of uniprocessor tasks. To achieve an optimal parallel execution time, the parallel execution of such a program requires an optimal scheduling of the multiprocessor tasks and an appropriate treatment of uniprocessor tasks. For an important subclass of structured method parallelism we present a scheduling methodology which takes data redistributions between multiprocessor tasks into account. As costs we use realistic parallel runtimes. The scheduling methodology is designed for an integration into a parallel compiler tool. We illustrate the multitask scheduling by several examples from numerical analysis.  相似文献   

10.
For the design of classic computers the parallel programming concept is used to abstract HW/SW interfaces during high level specification of application software. The software is then adapted to existing multiprocessor platforms using a low level software layer that implements the programming model. Unlike classic computers, the design of heterogeneous MPSoC includes also building the processors and other kind of hardware components required to execute the software. In this case, the programming model hides both hardware and software refinements. This paper deals with parallel programming models to abstract both hardware and software interfaces in the case of heterogeneous MPSoC design. Different abstraction levels will be needed. For the long term, the use of higher level programming models will open new vistas for optimization and architecture exploration like CPU/RTOS tradeoffs.  相似文献   

11.
The concept of aspect-orientation allows for modularizing crosscutting concerns as aspect modules. Aspect-orientation originally emerged at the programming level, and has stretched over other development phases now. Among them aspect-oriented modeling (AOM) is a hot topic, and there are many approaches supporting it. Petri net is a good formalism which can provide the foundations for modeling software and simulating its execution, but fails to resolve the problem of crosscutting concerns to support AOM. So, this paper presents an approach which extends the Petri net so as to support the AOM. In this paper, the basic functions of the system are modeled as base net by Petri net, and the crosscutting concerns are modeled as aspect nets. In order to analyze the whole system, woven mechanism is proposed to compose the aspect nets and base net together. The problems about aspect-aspect conflict and conflict relations may exist among the aspect nets matching the shared join point, thus this paper propose solutions to resolve them. The Object Petri net which is an extension of traditional Petri net is also extended so as to support aspect-oriented modeling here.  相似文献   

12.
The concept of aspect-orientation allows for modularizing crosscutting concerns as aspect modules. Aspect-orientation originally emerged at the programming level, and has stretched over other development phases now. Among them aspect-oriented modeling (AOM) is a hot topic, and there are many approaches supporting it. Petri net is a good formalism which can provide the foundations for modeling software and simulating its execution, but fails to resolve the problem of crosscutting concerns to support AOM. So, this paper presents an approach which extends the Petri net so as to support the AOM. In this paper, the basic functions of the system are modeled as base net by Petri net, and the crosscutting concerns are modeled as aspect nets. In order to analyze the whole system, woven mechanism is proposed to compose the aspect nets and base net together. The problems about aspectaspect conflict and conflict relations may exist among the aspect nets matching the shared join point, thus this paper propose solutions to resolve them. The Object Petri net which is an extension of traditional Petri net is also extended so as to support aspect-oriented modeling here.  相似文献   

13.
Multiprocessor execution of functional programs   总被引:1,自引:0,他引:1  
Functional languages have recently gained attention as vehicles for programming in a concise and elegant manner. In addition, it has been suggested that functional programming provides a natural methodology for programming multiprocessor computers. This paper describes research that was performed to demonstrate that multiprocessor execution of functional programs on current multiprocessors is feasible, and results in a significant reduction in their execution times.Two implementations of the functional language ALFL were built on commercially available multiprocessors.Alfalfa is an implementation on the Intel iPSC hypercube multiprocessor, andBuckwheat is an implementation on the Encore Multimax shared-memory multiprocessor. Each implementation includes a compiler that performs automatic decomposition of ALFL programs and a run-time system that supports their execution. The compiler is responsible for detecting the inherent parallelism in a program, and decomposing the program into a collection of tasks, calledserial combinators, that can be executed in parallel.The abstract machine model supported by Alfalfa and Buckwheat is calledheterogeneous graph reduction, which is a hybrid of graph reduction and conventional stack-oriented execution. This model supports parallelism, lazy evaluation, and highe order functions while at the same time making efficient use of the processors in the system. The Alfalfa and Buckwheat runtime systems support dynamic load balancing, interprocessor communication (if required), and storage management. A large number of experiments were performed on Alfalfa and Buckwheat for a variety of programs. The results of these experiments, as well as the conclusions drawn from them, are presented.This research was supported in part by National Science Foundation grants DCR-8302018 and DCR-8521451, by a DARPA subcontract with SDC/Unisys, and by gifts from Burroughs Austin Research Center and the Intel Corporation.  相似文献   

14.
The parallel inference machine (PIM) is now being developed at ICOT. It consists of a dozen or more clusters, each of which is a tightly coupled multiprocessor (comprising about eight processing elements) with shared global memory and a common bus. Kernel language 1 (KL1), a parallel logic programming language based on Guarded Horn Clauses (GHC), is executed on each PIM cluster. This paper describes the memory access characteristics in KL1 parallel execution and a locally parallel cache mechanism with hardware lock. The most important issue of locally parallel cache design is how to reduce common bus traffic. A write-back cache protocol having five cache states specially optimized for KL1 execution on each PIM cluster is described. We introduced new software controlled memory access commands, named DW, ER, and RP. A hardware lock mechanism is attached to the cache on each processor. This lock mechanism enables efficient word-by-word locking, reducing common bus traffic by using the cache states.  相似文献   

15.
用多机系统进行并行仿真是解决大规模连续系统实时仿真问题的有效途径。多机并行仿真中关键要解决的问题,是如何有效地将一个仿真任务分配到多机系统上并发执行,并获得高的加速比。本文介绍了作者自行研制的并行仿真软件支撑环境PARSIM,它可将一个传统单机上串行执行的仿真程序自动转换成在同构型多机系统上高效并发执行的并行仿真程序,并就并行性识别,多任务自动划分等问题展开了讨论,给出了相应的算法和应用实例。  相似文献   

16.
We present a design methodology for real-time vision applications aiming at significantly reducing the design-implement-validate cycle time on dedicated parallel platforms. This methodology is based upon the concept of algorithmic skeletons, i.e., higher order program constructs encapsulating recurring forms of parallel computations and hiding their low-level implementation details. Parallel programs are built by simply selecting and composing instances of skeletons chosen in a predefined basis. A complete parallel programming environment was built to support the presented methodology. It comprises a library of vision-specific skeletons and a chain of tools capable of turning an architecture-independent skeletal specification of an application into an optimized, deadlock-free distributive executive for a wide range of parallel platforms. This skeleton basis was defined after a careful analysis of a large corpus of existing parallel vision applications. The source program is a purely functional specification of the algorithm in which the structure of a parallel application is expressed only as combination of a limited number of skeletons. This specification is compiled down to a parametric process graph, which is subsequently mapped onto the actual physical topology using a third-party CAD software. It can also be executed on any sequential platform to check the correctness of the parallel algorithm. The applicability of the proposed methodology and associated tools has been demonstrated by parallelizing several realistic real-time vision applications both on a multi-processor platform and a network of workstations. It is here illustrated with a complete road-tracking algorithm based upon white-line detection. This experiment showed a dramatic reduction in development times (hence the term fast prototyping), while keeping performances on par with those obtained with the handcrafted parallel version. Received: 22 July 1999 / Accepted: 9 November 2000  相似文献   

17.
It has been difficult to develop simple formulations to predict the execution time of parallel programs due to the complexity of characterizing parallel hardware and software. In an attempt to clarify these characterizations, we introduce a methodology for applying a simple performance model based on Amdahl′s law. Our formulation results in accurate predictions of execution time on available systems, allowing programmers to select the optimal number of processors to apply to a particular problem or to select an appropriate problem size for the number of processors available. In short, we accurately quantify the scalability of a specific algorithm when it is run on a specific parallel computer. Our predictions are based on simple experiments that characterize machine performance and on a simple analysis of the parallel program. We illustrate our method for a program executed on a Sequent Symmetry multiprocessor with 20 processors. Our predictions closely match experimental results, differing by no more than 5% from the actual execution times. Our results illustrate key performance limitations of parallel systems, showing the impact of overhead and the scaling of problem size.  相似文献   

18.
This paper describes a debugger which uses the design artifacts of the Prometheus agent-oriented software engineering methodology to alert the developer testing the system, that a specification has been violated. Detailed information is provided regarding the error which can help the developer in locating its source. Interaction protocols specified during design, are converted to executable Petri net representations. The system can then be monitored at run time to identify situations which do not conform to specified protocols. A process for monitoring aspects of plan selection is also described. The paper then describes the Prometheus Design Tool, developed to support the Prometheus methodology, and presents a vision of an integrated development environment providing full life cycle support for the development of agent systems. The initial part of the paper provides a detailed summary of the Prometheus methodology and the artifacts on which the debugger is based.  相似文献   

19.
A model for representing and analyzing the design of a distributed software system is presented. The model is based on a modified form of Petri net, and enables one to represent both the structure and the behavior of a distributed software system at a desired level of design. Behavioral properties of the design representation can be verified by translating the modified Petri net into an equivalent ordinary Petri net and then analyzing that resulting Petri net. The model emphasizes the unified representation of control and data flows, partially ordered software components, hierarchical component structure, abstract data types, data objects, local control, and distributed system state. At any design level, the distributed software system is viewed as a collection of software components. Software components are externally described in terms of their input and output control states, abstract data types, data objects, and a set of control and data transfer specifications. They are interconnected through the shared control states and through the shared data objects. A system component can be viewed internally as a collection of subcomponents, local control states, local abstract data types, and local data objects.  相似文献   

20.
The realization of modern processors is based on a multicore architecture with increasing number of cores per processor. Multicore processors are often designed such that some level of the cache hierarchy is shared among cores. Usually, last level cache is shared among several or all cores (e.g., L3 cache) and each core possesses private low level caches (e.g., L1 and L2 caches). Superlinear speedup is possible for matrix multiplication algorithm executed in a shared memory multiprocessor due to the existence of a superlinear region. It is a region where cache requirements for matrix storage of the sequential execution incur more cache misses than in parallel execution. This paper shows theoretically and experimentally that there is a region, where the superlinear speedup can be achieved. We provide a theoretical proof of existence of a superlinear speedup and determine boundaries of the region where it can be achieved. The experiments confirm our theoretical results. Therefore, these results will have impact on future software development and exploitation of parallel hardware on the basis of a shared memory multiprocessor architecture. Copyright © 2013 John Wiley & Sons, Ltd.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号