首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
In the original failure-divergence semantic model for Communicating Sequential Processes (CSP), the incomplete treatment of successful process termination, and in particular parallel termination, permitted unnatural processes to be defined. In response to these problems, a number of different solutions have been proposed by various authors since the original failure-divergence model was developed by Hoare, Brookes and Roscoe. This paper presents an alternative solution to this problem, which is both closer to the original semantic model and provides greater flexibility over the type of parallel termination semantics available in CSP.  相似文献   

2.
A simple and efficient parallel FFT algorithm using the BSP model   总被引:1,自引:0,他引:1  
We present a new parallel radix-4 FFT algorithm based on the BSP model. Our parallel algorithm uses the group-cyclic distribution family, which makes it simple to understand and easy to implement. We show how to reduce the communication cost of the algorithm by a factor of 3, in the case that the input/output vector is in the cyclic distribution. We also show how to reduce computation time on computers with a cache-based architecture. We present performance results on a Cray T3E with up to 64 processors, obtaining reasonable efficiency levels for local problem sizes as small as 256 and very good efficiency levels for local sizes larger than 2048.  相似文献   

3.
In the literature, there are quite a few sequential and parallel algorithms to solve problems on distance-hereditary graphs. Two well-known classes of graphs, which contain trees and cographs, belong to distance-hereditary graphs. We consider the vertex-coloring problem on distance-hereditary graphs. Let T/sub d/(|V|, |E|) and P/sub d/d(|V|, |E|) denote the time and processor complexities, respectively, required to construct a decomposition tree representation of a distance-hereditary graph G=(V,E) on a PRAM model M/sub d/. Our algorithm runs in O(T/sub d/(|V|, |E|)+log|V|) time using O(P/sub d/(|V|, |E|)+|V|/log|V|) processors on M/sub d/. The best known result for constructing a decomposition tree needs O(log/sup 2/ |V|) time using O(|V|+|E|) processors on a CREW PRAM. If a decomposition tree is provided as input, we solve the problem in O(log |V|) time using O(|V|/log |V|) processors on an EREW PRAM. To the best of our knowledge, there is no parallel algorithm for this problem on distance-hereditary graphs.  相似文献   

4.
5.
GATE/Geant4 Monte Carlo simulations are computationally demanding applications, requiring thousands of processor hours to produce realistic results. The classical strategy of distributing the simulation of individual events does not apply efficiently for Positron Emission Tomography (PET) experiments, because it requires a centralized coincidence processing and large communication overheads. We propose a parallel computational model for GATE that handles event generation and coincidence processing in a simple and efficient way by decentralizing event generation and processing but maintaining a centralized event and time coordinator. The model is implemented with the inclusion of a new set of factory classes that can run the same executable in sequential or parallel mode. A Mann–Whitney test shows that the output produced by this parallel model in terms of number of tallies is equivalent (but not equal) to its sequential counterpart. Computational performance evaluation shows that the software is scalable and well balanced.  相似文献   

6.
The current informal semantics of the simple concurrent object-oriented programming (SCOOP) mechanism for Eiffel is described. We construct and discuss a model using the process algebra CSP. This model gives a more formal semantics for SCOOP than existed previously. We implement the model mechanically via a new tool called CSPsim. We examine two semantic variations of SCOOP: when and how far to pass locks, and when to wait for child calls to complete. We provide evidence that waiting for child calls to complete both unnecessarily reduces parallelism without any increase in safety and increases deadlocks involving callbacks. Through the creation and analysis of the model, we identify a number of ambiguities relating to reservations and the underlying run-time system and propose means to resolve them. M. J. Butler  相似文献   

7.
A simple mathematical model of laser drilling is proposed. Assuming axi-symmetry of the process around the axis of the laser beam, a one-dimensional formulation is obtained after cross-sectional averaging. The novelty of the approach relies on the fact that even after dimension reduction, the shape of the hole can still be described. The model is derived, implemented and validated for drilling using lasers with intensities in the GW/cm2 range and microsecond pulses.  相似文献   

8.
《Artificial Intelligence》2006,170(12-13):1031-1080
We address two aspects of constructing plans efficiently by means of satisfiability testing: efficient encoding of the problem of existence of plans of a given number t of time points in the propositional logic and strategies for finding plans, given these formulae for different values of t.For the first problem we consider three semantics for plans with parallel operator application in order to make the search for plans more efficient. The standard semantics requires that parallel operators are independent and can therefore be executed in any order. We consider a more relaxed definition of parallel plans which was first proposed by Dimopoulos et al., as well as a normal form for parallel plans that requires every operator to be executed as early as possible. We formalize the semantics of parallel plans emerging in this setting and present translations of these semantics into the propositional logic. The sizes of the translations are asymptotically optimal. Each of the semantics is constructed in such a way that there is a plan following the semantics exactly when there is a sequential plan, and moreover, the existence of a parallel plan implies the existence of a sequential plan with as many operators as in the parallel one.For the second problem we consider strategies based on testing the satisfiability of several formulae representing plans of n time steps for several values of n concurrently by several processes. We show that big efficiency gains can be obtained in comparison to the standard strategy of sequentially testing the satisfiability of formulae for an increasing number of time steps.  相似文献   

9.
《微型机与应用》2016,(7):57-59
在100G以太网媒体接入控制器(Media Access Control,MAC)的设计中,需要采用高位宽的并行数据来降低对时钟的要求。在使用并行循环冗余校验(Cyclical Redundancy Check,CRC)时会有一个问题,即需要计算CRC的数据域长度不一定是数据通道位宽的整数倍,导致最后一组数据无法使用数据通道的位宽对其进行CRC计算。为了解决这个问题,本文提出了在帧前填充0的处理方法。仿真和测试结果都验证了该方法的可行性。该处理方法也能应用到其他的通信系统中。  相似文献   

10.
The paper describes a parallel implementation of a grand challenge problem: global atmospheric modeling. The novel contributions of our work include (1) a detailed investigation of opportunities for parallelism in atmospheric global modeling based on spectral solution methods, (2) the experimental evaluation of overheads arising from load imbalances and data movement for alternative parallelization methods, and (3) the development of a parallel code that can be monitored and steered interactively based on output data visualizations and animations of program functionality or performance. Code parallelization takes advantage of the relative independence of computations at different levels in the earth's atmosphere, resulting in parallelism of up to 40 processors, each independently performing computations for different atmospheric levels and requiring few communications between different levels across model time steps. Next, additional parallelism is attained within each level by taking advantage of the natural parallelism offered by the spectral computations being performed (e.g. taking advantage of independently computable terms in equations). Performance measurements are performed on a 64-node KSR2 supercomputer. However, the parallel code has been ported to several shared memory parallel machines, including SGI multiprocessors, and has also been ported to distributed memory platforms like the IBM SP-2.  相似文献   

11.
The main contribution of this work is to offer a simple and cost-efficient parallel algorithm that, given an arbitrary n-vertex cubic graph G as input, produces an orthogonal grid drawing of G in O(log n) time, using n processors on an EREW PRAM. Our algorithm matches the time and cost performance of the best previously-known algorithm while at the same time improving the constant factors involved in two important metrics: layout area and number of bends. More importantly, however, our algorithm stands out by its conceptual simplicity and ease of implementation.  相似文献   

12.
The process targeting problem is usually formulated as a single objective optimization model. In this paper, a multi-objective optimization model is developed for the process targeting problem. The process under consideration produces a product with a normally distributed quality characteristic with unknown mean and known variance. The quality characteristic has a lower specification limit. The quality of the product is controlled via lot-by-lot acceptance sampling. The objectives used in the model are to maximize profit, income and product uniformity using the Taguchi quadratic loss function as a surrogate for product uniformity. An algorithm is proposed to obtain and rank the set of Pareto optimal points. The utility of the model is demonstrated using a numerical example from the literature. Sensitivity analysis on the model parameters showed that the results of the model are sensitive to changes in process variance. In addition, the optimal objectives of the profit function and product uniformity are more sensitive to changes in model parameters than the income function.  相似文献   

13.
Aquatic plants have become a problem in many freshwater bodies. Several control techniques are available, but mechanical methods (eg. harvesting) have advantages in some circumstances. Since the effectiveness of mechanical methods has been variable, a model has been formulated to investigate the impact on the plants of several harvesting regimes. The model reproduces the observed dynamics of unperturbed growth very well, and shows reasonable dynamics for harvested plants. Results from the model suggest that July is the optimum time to harvest, but that harvesting cannot be viewed as more than a temporary relief to the plant problem.  相似文献   

14.
This paper presents tuple channel model (TCM), a new coordination model for parallel and distributed programming. Our proposal is based on the use of tuple channels (TCs) to model the communication and synchronization of different activities. TCs are multi-point channels that allow complex data structures to be communicated among multiple producers and consumers. This communication model allows incremental and backward communication to be expressed, providing an elegant way of implicit and direct communication and reactive control. TCs can be dynamically interconnected through the use of user-defined connectors, providing great flexibility for the definition of complex and dynamic interaction protocols. TCM also provides a simple service management mechanism, by means of which open systems can be implemented in an appropriate way. The suitability, expressiveness and programming techniques of the model are presented by means of some illustrative examples. In addition, some implementation details of the developed prototypes are sketched and we show the preliminary results demonstrating the efficiency of the proposal.  相似文献   

15.
16.
This paper focuses on parallel interactive applications ranging from scientific visualization, to virtual reality or computational steering. Interactivity makes them particular on three main aspects: they are endlessly iterative, use advanced I/O devices, and must perform under strong performance constraints (latency, refresh rate). A data flow graph is a common approach to describe such applications. Edges represent data streams while vertices are nodes processing incoming data streams and producing new data streams. When applications become large, this approach shows its limits in terms of maintainability and portability. In this paper, we propose to use the composite design pattern to extend this model for supporting hierarchies of components. The component hierarchy is traversed to instantiate the application and extract the data flow graph required for the execution. This approach has been implemented for the FlowVR middleware. It enables to define parametric composite components, commonly called skeletons, that can be reused in various applications. This approach proved to significantly leverage application modularity as presented in different case studies.  相似文献   

17.
To facilitate data mining and integration (DMI) processes in a generic way, we investigate a parallel pipeline streaming model. We model a DMI task as a streaming data-flow graph: a directed acyclic graph (DAG) of Processing Elements (PEs). The composition mechanism links PEs via data streams, which may be in memory, buffered via disks or inter-computer data-flows. This makes it possible to build arbitrary DAGs with pipelining and both data and task parallelisms, which provide room for performance enhancement. We have applied this approach to a real DMI case in the life sciences and implemented a prototype. To demonstrate feasibility of the modelled DMI task and assess the efficiency of the prototype, we have also built a performance evaluation model. The experimental evaluation results show that a linear speedup has been achieved with the increase of the number of distributed computing nodes in this case study.  相似文献   

18.
We consider a simple priority queueing system with two different types of traffic, high and low priority. Each type of traffic generates arrivals to its own buffer that are modeled by continuous fluid flows. The input flow into the high priority buffer is Markov modulated, while the input rate into the low priority buffer is constant. The fluid is transmitted from the buffers with the high priority buffer having priority over the low priority buffer. The problem for the joint distribution of the buffer contents is derived and solved explicitly. Approximations are constructed for the tail behavior of the buffer content  相似文献   

19.
The computational complexity of a parallel algorithm depends critically on the model of computation. We describe a simple and elegant rule-based model of computation in which processors apply rules asynchronously to pairs of objects from a global object space. Application of a rule to a pair of objects results in the creation of a new object if the objects satisfy the guard of the rule. The model can be efficiently implemented as a novel MIMD array processor architecture, the Intersecting Broadcast Machine. For this model of computation, we describe an efficient parallel sorting algorithm based on mergesort. The computational complexity of the sorting algorithm isO(nlog2 n), comparable to that for specialized sorting networks and an improvement on theO(n 1.5) complexity of conventional mesh-connected array processors.  相似文献   

20.
A predicate/transition net model for a subset of Horn clause logic programs is presented. The syntax, transformation procedure, semantics, and deduction process for the net model are discussed. A possible parallel implementation for the net model is described, which is based on the concepts of communicating processes and relations. The proposed net model offers a syntactical variant of Horn clause logic and has two distinctions from other existing schemes for the logic programs: representation formalism and the deduction method. The net model provides an approach towards the solutions of the separation of logic from control and the improvement of the execution efficiency through parallel processing for the logic programs. The abstract nature of the net model also lends itself to different implementation strategies.<>  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号