
1.

Homomorphisms between models of parallel computation

Takumi Kasai Raymond E. Miller 《Journal of Computer and System Sciences》1982,25(3):285-331

To better understand the relationships between different models of parallel computation, we introduce a new computation system formulation and develop general notions of homomorphisms and isomorphisms between computation systems. This allows us to study relations between vector addition systems, vector replacement systems, Petri nets, and generalized Petri nets. Results in this paper that may be of particular interest include a long list of properties preserved under homomorphism, and constructions that show that vector replacement systems can be simulated by vector addition systems, and that generalized Petri nets can be emulated by Petri nets.
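For readers unfamiliar with vector addition systems, the following is a minimal sketch of the standard textbook firing rule: a transition vector may be added to a marking only if every component of the result stays non-negative. It is an illustration only and does not reproduce the computation system formulation of the paper; the names `fire` and `reachable` are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple

Vector = Tuple[int, ...]


def fire(marking: Vector, transition: Vector) -> Optional[Vector]:
    """Return the successor marking, or None if the transition is not enabled."""
    if len(marking) != len(transition):
        raise ValueError("marking and transition must have the same dimension")
    successor = tuple(m + t for m, t in zip(marking, transition))
    # A transition is enabled only if no component becomes negative.
    return successor if all(x >= 0 for x in successor) else None


def reachable(start: Vector, transitions: Iterable[Vector], max_steps: int) -> Set[Vector]:
    """Collect all markings reachable from start in at most max_steps firings."""
    transitions = list(transitions)
    seen = {start}
    frontier = [start]
    for _ in range(max_steps):
        next_frontier = []
        for marking in frontier:
            for transition in transitions:
                successor = fire(marking, transition)
                if successor is not None and successor not in seen:
                    seen.add(successor)
                    next_frontier.append(successor)
        frontier = next_frontier
    return seen
```

The step bound is needed because the reachability set of a vector addition system can be infinite.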

2.

Asynchronous and implicitly parallel evolutionary computation models

Domagoj Jakobović Marin Golub Marko Čupić 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2014,18(6):1225-1236

This paper presents the design and the application of asynchronous models of parallel evolutionary algorithms. An overview of the existing parallel evolutionary algorithm (PEA) models and available implementations is given. We present new PEA models in the form of asynchronous algorithms and implicit parallelization, as well as experimental data on their efficiency. The paper also discusses the definition of speedup in PEAs and proposes an appropriate speedup measurement procedure. The described parallel EA algorithms are tested on problems with varying degrees of computational complexity. The results show good efficiency of asynchronous and implicit models compared to existing parallel algorithms.

3.

Architecture-independent parallel computation

Skillicorn D.B. 《Computer》1990,23(12):38-50

The major parallel architecture classes are considered: single-instruction multiple-data (SIMD) computers, tightly coupled multiple-instruction multiple-data (MIMD) computers, hypercuboid computers and constant-valence MIMD computers. An argument that the PRAM model is universal over tightly coupled and hypercube systems, but not over constant-valence-topology, loosely coupled systems, is reviewed, showing precisely how the PRAM model is too powerful to permit broad universality. Ways in which a model of computation can be restricted to become universal over less powerful architectures are discussed. The Bird-Meertens formalism (R.S. Bird, 1989) is introduced, and it is shown how it is used to express computations in a compact way. It is also shown that the Bird-Meertens formalism is universal over all four architecture classes and that nontrivial restrictions of functional programming languages exist that can be efficiently executed on disparate architectures. The use of the Bird-Meertens formalism as the basis for a programming language is discussed, and it is shown that it is expressive enough to be used for general programming. Other models and programming languages with architecture-independent properties are reviewed.
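As a rough illustration of the compact style associated with the Bird-Meertens formalism, the sketch below builds a list homomorphism as a map followed by a reduce with an associative operator. The property that matters for parallel execution is that a list can be split anywhere, the pieces evaluated independently, and the partial results combined. The helper name `homomorphism` is hypothetical, and Python stands in for the formalism's own notation.

```python
from functools import reduce
from typing import Callable, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def homomorphism(f: Callable[[A], B], op: Callable[[B, B], B], identity: B) -> Callable[[Sequence[A]], B]:
    """Build h = reduce(op) . map(f); op must be associative with the given identity."""
    def h(xs: Sequence[A]) -> B:
        return reduce(op, map(f, xs), identity)
    return h


# Two small homomorphisms: list length and sum of squares.
length = homomorphism(lambda _: 1, lambda a, b: a + b, 0)
sum_of_squares = homomorphism(lambda x: x * x, lambda a, b: a + b, 0)

data = list(range(10))
# Splitting the input at any point and combining the partial results with
# the same operator gives the same answer as evaluating the whole list.
split_results = [
    sum_of_squares(data[:k]) + sum_of_squares(data[k:]) for k in range(len(data) + 1)
]
```

Each half of a split could be handed to a different processor, which is the sense in which such definitions are independent of the target architecture.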

4.

Limit dynamics for stochastic models of data exchange in parallel computation networks

A. G. Malyshkin 《Problems of Information Transmission》2006,42(3):234-250

We study limit dynamics of a system of interacting particles, which is one of the possible models for the parallel and distributed computation process. For a rather wide class of multi-particle interactions, we prove that the stochastic process describing the configuration of a particle system weakly converges in the fluid-dynamic limit to a deterministic process, which is a solution of a certain partial differential equation.

5.

Models for practical parallel computation

D. B. Skillicorn 《International journal of parallel programming》1991,20(2):133-158

A major reason for the lack of practical use of parallel computers has been the absence of a suitable model of parallel computation. Many existing models are either theoretical or are tied to a particular architecture. A more general model must be architecture independent, must realistically reflect execution costs, and must reduce the cognitive overhead of managing massive parallelism. A growing number of models meeting some of these goals have been suggested. We discuss their properties and relative strengths and weaknesses. We conclude that data parallelism is a style with much to commend it, and discuss the Bird-Meertens formalism as a coherent approach to data parallel programming. This work was supported by the Natural Sciences and Engineering Research Council of Canada.

6.

Semantics for data parallel computation

Michael D. Rice 《International journal of parallel programming》1990,19(6):477-509


7.

Massively parallel computation of conservation laws

Marc Garbey David Levine 《Parallel Computing》1990,16(2-3):293-304

We present a new method for computing solutions of conservation laws based on the use of cellular automata with the method of characteristics. The method exploits the high degree of parallelism available with cellular automata and retains important features of the method of characteristics. It yields high numerical accuracy and extends naturally to adaptive meshes and domain decomposition methods for perturbed conservation laws. We describe the method and its implementation for a Dirichlet problem with a single conservation law for the one-dimensional case.

Numerical results for the one-dimensional law with the classical Burgers nonlinearity or the Buckley-Leverett equation show good numerical accuracy outside the neighborhood of the shocks. The error in the area of the shocks is of the order of the mesh size. The algorithm is well suited for execution on both massively parallel computers and vector machines. We present timing results for an Alliant FX/8, Connection Machine Model 2, and CRAY X-MP.

8.

Algebraic-grammatical approach to interface design

V. G. Tul'chinskii 《Cybernetics and Systems Analysis》1996,32(5):732-740


9.

Semi-automatic process partitioning for parallel computation

Charles Koelbel Piyush Mehrotra John Van Rosendale 《International journal of parallel programming》1987,16(5):365-382

Automatic process partitioning is the operation of automatically rewriting an algorithm as a collection of tasks, each operating primarily on its own portion of the data, to carry out the computation in parallel. Hybrid shared memory systems provide a hierarchy of globally accessible memories. To achieve high performance on such machines one must carefully distribute the work and the data so as to keep the workload balanced while optimizing the access to nonlocal data. In this paper we consider a semi-automatic approach to process partitioning in which the compiler, guided by advice from the user, automatically transforms programs into such an interacting set of tasks. This approach is illustrated with a picture processing example written in BLAZE, which is transformed by the compiler into a task system maximizing locality of memory reference. Research supported by an IBM Graduate Fellowship. Research supported under NASA Contract No. 520-1398-0356. Research supported by NASA Contract No. NAS1-18107 while the last two authors were in residence at ICASE, NASA, Langley Research Center.

10.

The universal space for parallel computation

R. Zuczek 《Information Processing Letters》1977,6(2):42-45


11.

Linguistic models and computation

V. R. Aleev V. N. Mansurov 《Cybernetics and Systems Analysis》1979,15(5):641-645


12.

An efficient implementation of parallel eigenvalue computation for massively parallel processing (total citations: 4; self-citations: 0; citations by others: 4)

Takahiro Katagiri Yasumasa Kanada 《Parallel Computing》2001,27(14):1831-1845

This paper describes an efficient implementation and evaluation of a parallel eigensolver for computing all eigenvalues of dense symmetric matrices. Our eigensolver uses a Householder tridiagonalization method, which has higher parallelism and performance than conventional methods when the problem size is relatively small, e.g., of the order of 10,000. This is very important for relevant practical applications, in which many diagonalizations of such matrices are required. The routine was evaluated on 1024 processors of the HITACHI SR2201 and gave speedup ratios of about 2–5 times compared to the ScaLAPACK library on the same 1024 processors.

13.

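Two-level parallelism for short-range molecular dynamics simulation

王小伟 郭力 杨章远 《计算机与应用化学》2003,20(5):639-642

Molecular dynamics is an important computational tool with wide application in many fields. Because its computational cost is high, parallel computing methods are increasingly being introduced into molecular dynamics simulation. Working on the now-common SMP cluster systems and taking their architecture into account, this paper implements two-level parallel computation of short-range molecular dynamics for three parallel algorithms (spatial decomposition, atom decomposition, and force decomposition). It uses a hybrid MPI and Pthread programming model, with message passing between nodes and shared-memory programming within each node. The results show that, compared with the original message-passing-only parallelization, the two-level approach yields different computational efficiency for different algorithms, but overall it can exploit more computing resources and thereby helps to increase computing capability.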

14.

Mathematical models of quantum computation

Tetsuro Nishino 《New Generation Computing》2002,20(4):317-337

In this paper, we introduce two mathematical models of realistic quantum computation. First, we develop a theory of bulk quantum computation such as NMR (Nuclear Magnetic Resonance) quantum computation. For this purpose, we define bulk quantum Turing machine (BQTM for short) as a model of bulk quantum computation. Then, we define complexity classes EBQP, BBQP and ZBQP as counterparts of the quantum complexity classes EQP, BQP and ZQP, respectively, and show that EBQP=EQP, BBQP=BQP and ZBQP=ZQP. This implies that BQTMs are polynomially related to ordinary QTMs as long as they are used to solve decision problems. We also show that these two types of QTMs are also polynomially related when they solve a function problem which has a unique solution. Furthermore, we show that BQTMs can solve certain instances of NP-complete problems efficiently. On the other hand, in the theory of quantum computation, only feed-forward quantum circuits are investigated, because a quantum circuit represents a sequence of applications of time evolution operators. But, if a quantum computer is a physical device where the gates are interactions controlled by a current computer such as laser pulses on trapped ions, NMR and most implementation proposals, it is natural to describe quantum circuits as ones that have feedback loops if we want to visualize the total amount of the necessary hardware. For this purpose, we introduce a quantum recurrent circuit model, which is a quantum circuit with feedback loops. Let C be a quantum recurrent circuit which solves the satisfiability problem for a blackbox Boolean function including n variables with probability at least 1/2. And let s be the size of C (i.e. the number of the gates in C) and t be the number of iterations that is needed for C to solve the satisfiability problem. Then, we show that, for those quantum recurrent circuits, the minimum value of max(s, t) is O(n^2 2^(n/3)).
Tetsuro Nishino, D.Sc.: He is presently an Associate Professor in the Department of Information and Communication Engineering, The University of Electro-Communications. He received the B.S., M.S. and D.Sc degrees in mathematics from Waseda University, in 1982, 1984 and 1991 respectively. From 1984 to 1987, he joined Tokyo Research Laboratory, IBM Japan. From 1987 to 1992, he was a Research Associate of Tokyo Denki University, and from 1992 to 1994, he was an Associate Professor of Japan Advanced Institute of Science and Technology, Hokuriku. His main interests are circuit complexity theory, computational learning theory and quantum complexity theory.

15.

Efficient parallel algorithms for linear recurrence computation

Albert C. Greenberg Richard E. Ladner Michael S. Paterson Zvi Galil 《Information Processing Letters》1982,15(1):31-35


16.

Reconfigurable hardware solution to parallel prefix computation

Jin Hwan Park H. K. Dai 《The Journal of supercomputing》2008,43(1):43-58

This paper presents the design and implementation of an efficient reconfigurable parallel prefix computation hardware on field-programmable gate arrays (FPGAs). The design is based on a pipelined dataflow algorithm, and control logic is added to reconfigure the system for arbitrary parallelism degree. The system receives multiple input streams of elements in parallel and produces output streams in parallel. It has an advantage of controlling the degree of parallelism explicitly at run time. The time complexity of the design is O(d+(N−d)/d), where d and N are parallelism degree and stream size, respectively. When the stream size is sufficiently larger than the initial trigger time of the pipeline (d), the time complexity becomes O(N/d). Unlike the prefix computation circuits found in the literature, the design is scalable for different problem sizes including unknown sized data. The design is modular based on a finite state machine, and implemented and tested for target FPGA devices Xilinx Spartan2S XC2S300EFT256-6Q and XC2S600EFG676-6.
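To make the notion of prefix computation with a chosen parallelism degree concrete, here is a minimal software sketch. It is not the pipelined dataflow design of the paper: it uses the common two-phase blocked scan instead, in which the input is split into at most d blocks, each block is scanned independently (the part that could run in parallel), and a running carry is then folded into each later block. The function name `blocked_prefix` is hypothetical, and the operator is assumed to be associative.

```python
from itertools import accumulate
from operator import add
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def blocked_prefix(values: Sequence[T], d: int, op: Callable[[T, T], T] = add) -> List[T]:
    """Inclusive prefix computation over at most d blocks; op must be associative."""
    if d < 1:
        raise ValueError("parallelism degree d must be at least 1")
    n = len(values)
    if n == 0:
        return []
    size = -(-n // d)  # ceiling division: elements per block
    blocks = [values[i:i + size] for i in range(0, n, size)]
    # Phase 1: scan every block on its own; these scans are independent.
    local = [list(accumulate(block, op)) for block in blocks]
    # Phase 2: combine each later block with the total of all blocks before it.
    result = list(local[0])
    carry = local[0][-1]
    for scanned in local[1:]:
        result.extend(op(carry, x) for x in scanned)
        carry = op(carry, scanned[-1])
    return result
```

With d = 1 the function reduces to an ordinary sequential scan, which gives a simple reference against which to check other values of d.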


17.

On parallel computation of centrality measures of graphs

García Juan F. Carriegos M. V. 《The Journal of supercomputing》2019,75(3):1410-1428

Centrality measures, or indicators of centrality, identify the most relevant nodes of graphs. Although optimized algorithms exist for computing most of them, they are still time consuming and are even infeasible to apply to sufficiently large graphs, such as those representing social networks or extensive computer networks. In this paper, we present a parallel implementation in C of some optimal algorithms for computing some indicators of centrality. Our parallel version greatly reduces the execution time of its sequential (non-parallel) counterpart. The proposed solution relies on threading, allowing for a theoretical improvement in performance close to the number of logical processors (cores) of the single computer on which it is running. Our software has been tested on several platforms, including the supercomputer Calendula, on which our parallel implementation ran close to 18 times faster than our sequential one. Our solution is multi-platform and portable, working on any machine with several logical processors that is capable of compiling and running C code.


18.

The parallel computation of Racah coefficients using transputers

《Computer Physics Communications》1987,46(1):83-98


19.

Language realization of a parallel asynchronic computation model

T. I. Lel'chuk 《Cybernetics and Systems Analysis》1984,20(5):651-659


20.

Comparing models of higher type computation

G. Rosolini Th. Streicher 《Electronic Notes in Theoretical Computer Science》1999,23(1):159
