Search results: 9 records found
1.
The EGS4 code, developed at Stanford Linear Accelerator Center, simulates electron-photon cascading phenomena. The original code is inherently sequential, processing one particle at a time. This paper reports on a series of experiments in parallelizing different versions of EGS4. Our parallel experiments were run on a 30-processor Sequent Balance B21 and a 6-processor Symmetry S27. We have considered the following approaches for parallel execution of this application code:
(1) Original sequential version modified for parallel processing: 1 processor;
(2) Version 1 run multiprocessed: 1 to 29 processors;
(3) Sequential version modified for large-grain parallel processing: 1 processor;
(4) Version 3 run using the Sequent Microtasking Library: 1 to 29 processors.

For each approach, we discuss the relative advantages and disadvantages in the areas of coding effort, understandability and portability, as well as performance, and outline a new parallelization approach we are currently pursuing based on Large-Grain Data Flow techniques.
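
The large-grain idea in this abstract (independent particle histories handed out in batches) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the names `simulate_batch` and `run_parallel`, the toy absorption probability, and the use of Python threads. It is not EGS4 and not the Sequent Microtasking Library.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_batch(seed, n_histories):
    """Toy stand-in for one large-grain task: follow n_histories particles."""
    rng = random.Random(seed)  # private random stream, so batches do not interact
    absorbed = 0
    for _ in range(n_histories):
        # Hypothetical physics: each particle is absorbed with probability 0.3.
        if rng.random() < 0.3:
            absorbed += 1
    return absorbed


def run_parallel(n_batches, histories_per_batch, workers):
    """Distribute independent batches over a pool and combine the tallies."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(
            pool.map(lambda s: simulate_batch(s, histories_per_batch),
                     range(n_batches))
        )
    return sum(counts) / (n_batches * histories_per_batch)
```

Because each batch owns its seed, the combined estimate does not depend on the number of workers. In standard CPython, threads show only the task structure; a process pool would be needed for real speedup on CPU-bound work.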

2.
Today, the field of high-speed computers and supercomputing applications is dominated by the vector-processor architecture. This paper gives a survey of the architectural principles of vector computers, such as segmentation, pipelining, and chaining, as well as of the spectrum of real systems available on the market. It illuminates the potential and the limitations of vectorization strategies. Recent developments towards multi-vector-computer systems give impetus to new supercomputing concepts that balance vectorization against parallel computation by exploiting multitasking principles. Covering a wide spectrum of applications, vector supercomputers are making relevant contributions to progress in scientific research and technology.
3.
A hybrid granularity model is proposed for general concurrent solution. It is applied to the triangular factorization of dense matrices ranging in size from 4 to 1024. Concurrency is achieved at two levels: (1) with small (micro) task granularity and (2) with large (blocked) task granularity. Relevance to a many-processor CRAY X-MP is demonstrated by simulation.
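
To make the two granularity levels concrete, here is a minimal sketch of triangular (LU) factorization without pivoting in which the row updates of each elimination step are grouped into tasks. The names `update_rows`, `lu_in_place`, and `rows_per_task` are hypothetical, and the tasks run one after another here. It only shows where micro and blocked tasks could be cut; it is not the model proposed in the abstract.

```python
def update_rows(a, k, rows):
    """One task: eliminate column k from the given rows.

    Rows are independent of each other within step k, so different
    tasks could in principle run on different processors.
    """
    n = len(a)
    for i in rows:
        a[i][k] /= a[k][k]  # multiplier, stored in the L part
        for j in range(k + 1, n):
            a[i][j] -= a[i][k] * a[k][j]


def lu_in_place(a, rows_per_task=1):
    """Overwrite a with its LU factors (unit lower L, upper U), no pivoting.

    rows_per_task=1 gives micro tasks (one row each); larger values
    give blocked tasks that cover several rows.
    """
    n = len(a)
    for k in range(n - 1):
        remaining = list(range(k + 1, n))
        tasks = [remaining[s:s + rows_per_task]
                 for s in range(0, len(remaining), rows_per_task)]
        for rows in tasks:  # executed serially in this sketch
            update_rows(a, k, rows)
    return a
```

Changing `rows_per_task` changes only how the work is partitioned, not the computed factors, which is the property a granularity study relies on.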
4.
The ECMWF weather model runs daily as a time-critical application. Acceptable elapsed times are achieved by multitasking the code on a CRAY X-MP/48. This is done at a high level, giving rise to large tasks. Investigations have been carried out to tackle inefficiencies by microtasking at a low level, so that the code can take advantage of any idle processors that may become available.
5.
6.
Requirements for tools analyzing the performance of parallel programs with respect to parallel and sequential parts, overhead, and load balance, as well as available tools for programs parallelized with Cray Microtasking or Autotasking, are described.
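
The quantities named in this abstract (sequential versus parallel parts, load balance) are commonly summarized with a few standard formulas. The sketch below uses textbook definitions of speedup, efficiency, and Amdahl's law, plus a simple max-over-mean load-imbalance ratio. The function names are hypothetical and do not correspond to any Cray tool.

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial to parallel elapsed time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Speedup per processor; 1.0 means perfect scaling."""
    return speedup(t_serial, t_parallel) / processors


def amdahl_speedup(serial_fraction, processors):
    """Upper bound on speedup when a fixed fraction of the work is sequential."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def load_imbalance(busy_times):
    """Max over mean of per-processor busy time; 1.0 means perfectly balanced."""
    return max(busy_times) / (sum(busy_times) / len(busy_times))
```

For example, with 10% sequential work, `amdahl_speedup(0.1, 4)` evaluates to 1 / 0.325, about 3.08, so even four ideal processors cannot reach a fourfold speedup.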
7.
On the multiprocessor vector supercomputer CRAY X-MP, parallelism beyond vectorization can be exploited at the programming-language level by two multitasking strategies: macrotasking and, more recently, microtasking. In this paper, multitasking results and experiences are presented which have been gained by applying these two implemented modes to linear-algebra and non-numerical algorithms as well as to a large fluid-flow simulation code. While comparing the concepts and realizations of macrotasking and microtasking, the features, tools, and problems of multitasking programming and the potential user benefit of these parallel processing techniques are discussed.
8.
Computations involving symmetric, positive definite and band matrices are kernel operations in the numerical treatment of many models arising in science and engineering. It is desirable to achieve a high level of performance when such operations are to be carried out on a vector processor. If the operations are performed by rows or columns (as in the EXTENDED BLAS subroutines), then the loops are vectorized but the speed of computations, measured in Mflops, is not very high, because the arrays involved are normally short. Therefore the computations should be organized by diagonals. Furthermore, some special devices are to be applied in order to unroll the loops. Finally, one should be careful with the storage scheme. It is demonstrated that if (i) the computations are organized by diagonals, (ii) the main loops are unrolled and (iii) the storage scheme is such that the work with some zero-elements is avoided, then the speed of computations is nearly the same as that obtained in the computations with dense matrices. If a particular vector machine is in use (in our case a CRAY X-MP computer), then the speed can be increased further by (iv) coding some basic operations in machine language and (v) using the different processors of the vector computer in parallel. The efficiency of the exploitation of the special features of the particular computer that is to be used is also illustrated by numerical examples.

Kernel subroutines performing matrix-vector multiplications are described. Representative tests are used to demonstrate the efficiency of these kernels.
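
Organizing the computation by diagonals can be sketched for a symmetric band matrix-vector product. The storage layout assumed here (`diags[k][i]` holds the entry in row `i`, column `i + k`, for the main diagonal and each superdiagonal) and the name `sym_band_matvec` are illustrative choices, not the paper's kernels. Loop unrolling and machine-language coding are left out.

```python
def sym_band_matvec(diags, x):
    """Compute y = A x for a symmetric band matrix stored by diagonals.

    diags[0] is the main diagonal (length n); diags[k] is the k-th
    superdiagonal (length n - k). The subdiagonals are implied by symmetry,
    and entries outside the band are never stored or touched.
    """
    n = len(x)
    y = [diags[0][i] * x[i] for i in range(n)]
    for k in range(1, len(diags)):
        d = diags[k]
        # The inner loop runs along a whole diagonal, so its length is
        # n - k rather than the (short) bandwidth.
        for i in range(n - k):
            y[i] += d[i] * x[i + k]      # upper-triangle contribution
            y[i + k] += d[i] * x[i]      # mirrored lower-triangle contribution
    return y
```

With a bandwidth much smaller than `n`, each inner loop has close to `n` iterations, which is the long-vector behaviour the abstract argues for. A row-oriented loop would have only as many iterations as the band is wide.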

9.
The integration of vector computers into multiprocessor configurations allows the use of multiple high-speed processors in parallel for one program. Two aspects are considered important for an efficient use of multiprocessor configurations: first, the flexibility, speed and user friendliness of the available synchronization and communication primitives, and second, the problems users face in detecting data dependencies and in translating programs correctly into the parallel form required by the system. This paper is intended to give an overview of our experiences in multitasking using up to four CPUs of a CRAY X-MP/48. The results gained by macrotasking and microtasking will be compared for program kernels and real-life application programs. Special attention is paid to the difficulties of using more than two CPUs in parallel.