
1.

Tradeoffs in granularity and parallelization for a Monte Carlo shower simulation code

Kenichi Miura 《Parallel Computing》1988,8(1-3):91-100

The EGS4 code, developed at Stanford Linear Accelerator Center, simulates electron-photon cascading phenomena. The original code is inherently sequential: processing one particle at a time. This paper reports on a series of experiments in parallelizing different versions of EGS4. Our parallel experiments were run on a 30-processor Sequent Balance B21 and a 6-processor Symmetry S27. We have considered the following approaches for parallel execution of this application code:

1. Original sequential version modified for parallel processing: 1 processor;
2. Version 1 run multiprocessed: 1 to 29 processors;
3. Sequential version modified for large-grain parallel processing: 1 processor;
4. Version 3 run using the Sequent Microtasking Library: 1 to 29 processors.

For each approach, we discuss the relative advantages and disadvantages in the areas of coding effort, understandability and portability, as well as performance, and outline a new parallelization approach we are currently pursuing based on Large-Grain Data Flow techniques.
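As a rough illustration of the large-grain approach described in this abstract, the sketch below treats each complete shower (one primary particle and all its descendants) as an independent task. It is not EGS4 and not the Sequent Microtasking Library: the splitting rule, the `CUTOFF` value, and the function names are hypothetical, and Python threads are used only to show the task structure, not to claim any speedup.

```python
import random
from concurrent.futures import ThreadPoolExecutor

CUTOFF = 1.0  # hypothetical energy below which a particle is absorbed


def simulate_shower(seed, energy=64.0):
    """Follow one primary particle and all its descendants.

    Returns the number of particles absorbed below CUTOFF.
    """
    rng = random.Random(seed)  # private generator, so tasks share no state
    stack = [energy]           # particles still waiting to be processed
    absorbed = 0
    while stack:
        e = stack.pop()        # one particle at a time, as in a sequential code
        if e < CUTOFF:
            absorbed += 1
            continue
        frac = rng.uniform(0.1, 0.9)  # toy splitting rule, not real physics
        stack.append(e * frac)
        stack.append(e * (1.0 - frac))
    return absorbed


def run_sequential(seeds):
    """Process the showers one after another."""
    return [simulate_shower(s) for s in seeds]


def run_large_grain(seeds, workers=4):
    """Hand whole showers to a pool of workers; results keep seed order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_shower, seeds))
```

Because every shower owns its random generator, `run_large_grain(seeds)` returns the same counts as `run_sequential(seeds)` for the same seeds, which makes the parallel version easy to check against the sequential one.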

2.

Vector-supercomputers

F. Hossfeld 《Parallel Computing》1988,7(3):373-385

Today, the field of high-speed computers and supercomputing applications is dominated by the vector-processor architecture. This paper surveys the architectural principles of vector computers, such as segmentation, pipelining, and chaining, as well as the spectrum of real systems available on the market. It illuminates the potential and the limitations of vectorization strategies. Recent developments towards multi-vector-computer systems give impetus to new supercomputing concepts that balance vectorization against parallel computation by exploiting multitasking principles. Covering a wide spectrum of applications, vector-supercomputers are making relevant contributions to the progress in scientific research and technology.

3.

Task granularity studies on a many-processor CRAY X-MP

D.A. Calahan 《Parallel Computing》1985,2(2):109-118

A hybrid granularity model is proposed for general concurrent solution. It is applied to the triangular factorization of a dense matrix ranging in size from 4 to 1024. Concurrency is achieved at two levels: (1) with small (micro) task granularity and (2) with large (blocked) task granularity. Relevance to a many-processor CRAY X-MP is demonstrated by simulation.

4.

Microtasking as a complement to macrotasking

D. Dent M. O'Neill 《Parallel Computing》1988,8(1-3):149-154

The ECMWF weather model runs daily as a time critical application. Acceptable elapsed times are achieved by multitasking the code on a CRAY X-MP/48. This is done at a high level giving rise to large tasks. Investigations have been carried out to tackle inefficiencies by microtasking at a low level so that the code can take advantage of any idle processors which may become available.

5.

Cedar Fortran and other vector and parallel Fortran dialects

Mark D. Guzzi David A. Padua Jay Hoeflinger Duncan H. Lawrie 《The Journal of supercomputing》1990,4(1):37-62


6.

Analysis tools for Micro- and Autotasking programs on CRAY multiprocessor systems

Yves Escaig Wilfried Oed 《Parallel Computing》1991,17(12):1425-1433

Requirements for tools analyzing the performance of parallel programs with respect to parallel and sequential parts, overhead, and load balance, as well as available tools for programs parallelized with Cray Microtasking or Autotasking are described.

7.

Multitasking: Experience with applications on a CRAY X-MP

F. Hossfeld R. Knecht W. E. Nagel 《Parallel Computing》1989,12(3):259-283

On the multiprocessor vector-supercomputer CRAY X-MP, parallelism beyond vectorization can be exploited on the programming language level by two multitasking strategies: macrotasking and, more recently, microtasking. In this paper, multitasking results and experiences are presented which have been gained by applying these two implemented modes to linear-algebra and non-numerical algorithms as well as to a large fluid-flow simulation code. While comparing the concepts and realizations of macrotasking and microtasking, the features, tools, and problems of multitasking programming and the potential user benefit of these parallel processing techniques are discussed.

8.

Computations with symmetric, positive definite and band matrices on a parallel vector processor

Zahari Zlatev 《Parallel Computing》1988,8(1-3):301-312

Computations involving symmetric, positive definite and band matrices are kernel operations in the numerical treatment of many models arising in science and engineering. It is desirable to achieve a high level of performance when such operations are to be carried out on a vector processor. If the operations are performed by rows or columns (as in the EXTENDED BLAS subroutines), then the loops are vectorized but the speed of computations, measured in Mflops, is not very high, because the arrays involved are normally short. Therefore the computations should be organized by diagonals. Furthermore, some special devices are to be applied in order to unroll the loops. Finally, one should be careful with the storage scheme. It is demonstrated that if (i) the computations are organized by diagonals, (ii) the main loops are unrolled and (iii) the storage scheme is such that the work with some zero-elements is avoided, then the speed of computations is nearly the same as that obtained in the computations with dense matrices. If a particular vector machine is in use (in our case a CRAY X-MP computer), then the speed can be increased further by (iv) coding some basic operations in machine language and (v) using the different processors of the vector computer in parallel. The efficiency of the exploitation of the special features of the particular computer that is to be used is also illustrated by numerical examples.

Kernel subroutines performing matrix-vector multiplications are described. Representative tests are used to demonstrate the efficiency of these kernels.
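The idea of organizing a symmetric band matrix-vector product by diagonals can be sketched as follows. This is a minimal illustration, not the paper's kernels: the storage layout (`diagonals[d][i]` holding the entry in row `i`, column `i + d`) and the function names are assumptions, and no loop unrolling or machine-language coding is attempted. The point is that each inner loop runs along a whole diagonal of length `n - d`, rather than along a short row segment of the band, and that zeros outside the band are never stored or touched.

```python
def symmetric_band_matvec(diagonals, x):
    """Compute y = A x for a symmetric band matrix stored by diagonals.

    diagonals[d][i] holds A[i][i + d] (and, by symmetry, A[i + d][i]),
    so diagonals[d] has length n - d.
    """
    n = len(x)
    y = [0.0] * n
    # Main diagonal: one long elementwise product.
    main = diagonals[0]
    for i in range(n):
        y[i] += main[i] * x[i]
    # Each off-diagonal contributes twice: once above, once below.
    for d in range(1, len(diagonals)):
        diag = diagonals[d]
        for i in range(n - d):
            y[i] += diag[i] * x[i + d]   # upper triangle entry A[i][i + d]
            y[i + d] += diag[i] * x[i]   # lower triangle entry A[i + d][i]
    return y


def dense_from_diagonals(diagonals, n):
    """Expand the diagonal storage into a full matrix, for checking only."""
    a = [[0.0] * n for _ in range(n)]
    for d, diag in enumerate(diagonals):
        for i in range(n - d):
            a[i][i + d] = diag[i]
            a[i + d][i] = diag[i]
    return a
```

For example, the 5-by-5 tridiagonal matrix with 2 on the main diagonal and -1 on the off-diagonals is stored as `[[2.0] * 5, [-1.0] * 4]`; multiplying it by `[1, 2, 3, 4, 5]` gives `[0, 0, 0, 0, 6]`.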

9.

Using multiple CPUs for problem solving: Experiences in multitasking on the CRAY X-MP/48

Wolfgang E. Nagel 《Parallel Computing》1988,8(1-3):223-230

The integration of vector computers into multiprocessor configurations allows the use of multiple high-speed processors in parallel for one program. There are two aspects which are considered to be important for an efficient use of multiprocessor configurations. First, the flexibility, speed and user friendliness of the available synchronization and communication primitives, and second, the user problems in detecting data dependencies and in translating programs correctly into the parallel form required by the system. This paper is intended to give an overview of our experiences in multitasking using up to four CPUs of a CRAY X-MP/48. The results gained by macrotasking and microtasking will be compared for program kernels and real-life application programs. Special attention is paid to the difficulties of using more than two CPUs in parallel.