1.
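Parallel processing is a key technology in modern computing and a defining architectural feature of the new generation of computers. We studied parallel processing technology from two perspectives: basic principles and implementation techniques. This paper presents the design principles, design and implementation, performance specifications, and performance test results of the BJ-1 parallel computer.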
2.
3.
István Deák 《Parallel Computing》1990,15(1-3):155-164
Almost all simulational computations require uniformly distributed random numbers. Generators of uniform random numbers are considered and assessed with respect to their possible use on parallel computers. Two recent, commercially available computers are given special attention: the Connection Machine and the T Series. Feedback shift register type generators with a large Mersenne prime are recommended for implementation on these computers.
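A minimal sketch of a feedback shift register generator of the kind the abstract recommends. The exponents (89, 38) correspond to the commonly tabulated trinomial x^89 + x^38 + 1, where 2^89 - 1 is a Mersenne prime; they are an assumption for illustration and are not taken from the paper, as are the class name `GFSR`, the 32-bit word size, and the seeding through Python's `random` module.

```python
import random


class GFSR:
    """Generalized feedback shift register: x[n] = x[n-P] XOR x[n-Q]."""

    P, Q = 89, 38  # assumed lags (trinomial x^89 + x^38 + 1), not from the paper

    def __init__(self, seed, bits=32):
        # Illustrative seeding only: fill the register from Python's own RNG.
        rng = random.Random(seed)
        self.bits = bits
        self.state = [rng.getrandbits(bits) for _ in range(self.P)]
        self.i = 0  # index of the oldest word, x[n-P]

    def next_int(self):
        p, q = self.P, self.Q
        # state[i] holds x[n-P]; x[n-Q] sits Q slots behind the write position.
        new = self.state[self.i] ^ self.state[(self.i + p - q) % p]
        self.state[self.i] = new  # overwrite the oldest word with x[n]
        self.i = (self.i + 1) % p
        return new

    def next_float(self):
        # Map the integer to the half-open interval [0, 1).
        return self.next_int() / 2 ** self.bits
```

On a parallel machine each processor would need its own, non-overlapping stream; giving every processor a different seed, as this sketch would allow, is the naive approach and does not by itself guarantee independent streams.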
4.
Lixin Zhan 《Computer Physics Communications》2008,179(5):339-344
The Wang-Landau algorithm is a flat-histogram Monte Carlo method that performs random walks in the configuration space of a system to obtain a close estimation of the density of states iteratively. It has been applied successfully to many research fields. In this paper, we propose a parallel implementation of the Wang-Landau algorithm on computers of shared memory architectures by utilizing the OpenMP API for distributed computing. This implementation is applied to Ising model systems with promising speedups. We also examine the effects on the running speed when using different strategies in accessing the shared memory space during the updating procedure. The allowance of data race is recommended in consideration of the simulation efficiency. Such treatment does not affect the accuracy of the final density of states obtained.
5.
《Computers & Structures》1987,26(4):551-559
The development of general-purpose finite element computer software systems has provided the capability to analyze a wide range of linear and non-linear structural problems. However, these software systems are severely limited for non-linear response calculations because of the available speed on current sequential computers. Recent and projected advances in parallel multiple instruction multiple data (MIMD) computers provide an opportunity for significant gains in computing speed and for broadening the range of structural problems which may be solved. The key to these gains is the effective selection and implementation of algorithms which exploit parallel computing. This paper documents experiences solving transient response calculations on an experimental MIMD computer, termed the Finite Element Machine. The paper describes the algorithm used, its implementation for parallel computations, and results for representative one- and two-dimensional dynamic response test problems. The results show computation speedups of up to 7.83 for eight processors, and indicate that significant speedups of solution time are possible for non-linear dynamic response calculations through the use of many processors and appropriate parallel integration algorithms. The results are extremely encouraging and suggest that significant speedups in structural computations can be achieved through advances in parallel computers.
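For reference, the reported speedup can be restated as a parallel efficiency, using the usual definition of efficiency as speedup S divided by processor count p (the definition is standard; the abstract itself reports only the speedup):

```latex
E = \frac{S}{p} = \frac{7.83}{8} \approx 0.98
```

That is, the best case quoted corresponds to roughly 98% efficiency on eight processors.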
6.
《Journal of Parallel and Distributed Computing》2001,61(6):713-736
We describe and analyze parallelization techniques for the implementation of portable structured adaptive mesh applications on distributed memory parallel computers. Such methods are difficult to implement on parallel computers because they employ elaborate dynamic data structures to selectively capture localized irregular phenomena. Our infrastructure supports a set of layered abstractions that encapsulate low-level details of resource management, such as grid generation, interprocessor communication, and load balancing. Our layered design also provides the flexibility necessary to accommodate new applications and to fine-tune performance. This flexibility has enabled us to show that the uniformity restrictions imposed by a data parallel Fortran implementation (e.g., HPF) would significantly impact performance of structured adaptive mesh methods. We present computational results from eigenvalue computation arising in materials design.
7.
We describe portable software to simulate universal quantum computers on massively parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as an IBM BlueGene/L, an IBM Regatta p690+, a Hitachi SR11000/J1, a Cray X1E, an SGI Altix 3700 and clusters of PCs running Windows XP. We study the performance of the software by simulating quantum computers containing up to 36 qubits, using up to 4096 processors and up to 1 TB of memory. Our results demonstrate that the simulator exhibits nearly ideal scaling as a function of the number of processors and suggest that the simulation software described in this paper may also serve as a benchmark for testing high-end parallel computers.
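The 36-qubit and 1 TB figures are consistent with storing the full state vector, as a rough estimate shows. The estimate assumes each complex amplitude occupies 16 bytes (two 8-byte floating-point numbers), which the abstract does not state:

```latex
2^{36}\ \text{amplitudes} \times 16\ \text{bytes} = 2^{40}\ \text{bytes} = 1\ \text{TiB}
```

Under that assumption, each additional qubit doubles the memory required.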
8.
The parallel ‘Deutschland-Modell’ and its implementation on distributed memory parallel computers using the message-passing library PARMACS 6.0 are described. Performance results on a Cray T3D are given and the problem of dynamical load imbalances is addressed.
9.
To efficiently perform morphological operations on neighborhood-processing-based parallel image computers, we need to decompose structuring elements larger than the neighborhood that can be directly handled into neighborhood subsets. In the special case that the structuring element is a convex polygon, there are known decomposition algorithms in the literature. In this paper, we give an algorithm for the optimal decomposition of arbitrarily shaped structuring elements, enabling an optimal implementation of morphological operations on neighborhood-connected parallel computers in the general case.
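The reason decomposition helps can be illustrated with the simplest case: dilating by a large structuring element gives the same result as dilating in sequence by smaller elements whose Minkowski sum is the large one. The sketch below shows only this property for a 5x5 square split into two 3x3 squares; it is not the paper's algorithm for arbitrary shapes, and the names `dilate` and `square` are hypothetical.

```python
def dilate(image, element):
    """Binary dilation of a set of pixel coordinates by a structuring element."""
    return {(x + dx, y + dy) for (x, y) in image for (dx, dy) in element}


def square(radius):
    """Square structuring element of side 2 * radius + 1, centred on the origin."""
    offsets = range(-radius, radius + 1)
    return {(dx, dy) for dx in offsets for dy in offsets}


image = {(0, 0), (4, 1), (2, 5)}

# One pass with a 5x5 element, which a 3x3-neighbourhood machine cannot apply directly.
direct = dilate(image, square(2))

# Two passes with 3x3 elements, each of which fits the machine's neighbourhood.
staged = dilate(dilate(image, square(1)), square(1))
```

Here `direct` and `staged` are the same set of pixels, so the two-pass form can stand in for the single large operation.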
10.
Ahmed K. Noor 《Engineering with Computers》1988,3(4):225-241
A brief review is made of the fundamental concepts and basic issues of parallel processing. Discussion focuses on mechanisms for parallel processing, construction and implementation of parallel numerical algorithms, performance evaluation of parallel processing machines and numerical algorithms, and parallelism in finite element computations. A novel partitioning strategy is outlined for maximizing the degree of parallelism on computers with a small number of powerful processors.
11.
12.
《Computers & Structures》1987,25(3):395-403
The eigenvalue problem associated with structural vibration analysis is a major, computationally intensive activity in large-scale finite element calculations. Advances in parallel computers together with appropriate solution methods have the potential for providing high-speed computational power to aid eigenvalue solutions for these large problems. The key to exploiting this potential is the development of appropriate methods tailored for such parallel computers. This paper reports on experiences from a study involving the implementation of the Lanczos method on a parallel computer. The results of this study show that introducing shifts, assigning each processor a different region in the eigenvalue spectrum, and implementing the Lanczos calculation steps in parallel is an effective strategy for speeding up calculations. This approach provides good parallel performance and easy balance of processor workload. Two example vibration problems were solved to assess the behavior of the Lanczos implementation. The test-problem results include examples of the Lanczos phenomenon where lack of orthogonality in the vectors can result in spurious eigenvalues. Tests were incorporated in the parallel calculations which detected these spurious eigenvalues. The parallel eigenvalue algorithm demonstrates that significant speedups in calculation time can be realized over traditional sequential methods.
13.
14.
Per Brinch Hansen 《Software: Practice and Experience》1989,19(6):579-592
Joyce is a programming language for parallel computers based on CSP and Pascal. A Joyce program defines concurrent agents which communicate through unbuffered channels. This paper describes a multiprocessor implementation of Joyce.
15.
Shingo Kurose Kunihito Yamamori Masaru Aikawa Ikuo Yoshihara 《Artificial Life and Robotics》2012,16(4):533-536
An island model is a typical implementation of genetic programming on parallel computers with distributed memory. The island model has a migration facility that sends/receives some individuals in an island to/from another island to maintain diversity. The island model requires synchronization to migrate same-generation individuals between islands, and this synchronization causes an increase in computation time. This article proposes a new parallel genetic programming implementation based on the island model with asynchronous migration. Most recent computers are equipped with one or more multi-core processors, and are suitable for multi-threading. Therefore we employ a communication thread for migration between islands. The communication thread on a processor communicates with the communication thread on another processor to migrate individuals at appropriate intervals. Since the migration and other genetic operations can be independently processed on each core, and since we allow the exchange of individuals of different generations, no synchronization is needed in our implementation. In addition, a fitness calculation is also executed in parallel by the remaining cores. Experimental results show that the proposed method, using 40 threads, can reduce the computation time to about 17% of that of serial GP.
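A minimal sketch of asynchronous migration between islands. It departs from the article in several labelled ways: it evolves bit strings on a toy "count the ones" fitness instead of genetic programming trees, it runs each island as one Python thread with a non-blocking inbox instead of a dedicated communication thread per processor, and every name and parameter (`run_island`, `run_islands`, `GENOME_LEN`, the ring topology, the migration interval) is an assumption made for illustration.

```python
import queue
import random
import threading

GENOME_LEN = 32  # assumed toy genome size


def fitness(genome):
    # Toy objective (number of ones); a stand-in for a real GP fitness function.
    return sum(genome)


def run_island(idx, inboxes, results, generations=200, pop_size=20, interval=10):
    rng = random.Random(idx)
    pop = [[rng.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(pop_size)]
    neighbour = inboxes[(idx + 1) % len(inboxes)]  # ring topology (assumption)
    for gen in range(generations):
        # Take whatever immigrants have arrived; never wait for a sender, so
        # islands may exchange individuals from different generations.
        try:
            while True:
                migrant = inboxes[idx].get_nowait()
                worst = min(range(pop_size), key=lambda k: fitness(pop[k]))
                pop[worst] = migrant
        except queue.Empty:
            pass
        # Elitist step: keep the better half, refill with one-bit mutants.
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        for parent in elite:
            child = parent[:]
            child[rng.randrange(GENOME_LEN)] ^= 1
            children.append(child)
        pop = elite + children
        if gen % interval == 0:
            neighbour.put(pop[0][:])  # emigrate a copy of the current best
    results[idx] = max(fitness(g) for g in pop)


def run_islands(n_islands=4):
    inboxes = [queue.Queue() for _ in range(n_islands)]
    results = [None] * n_islands
    threads = [
        threading.Thread(target=run_island, args=(i, inboxes, results))
        for i in range(n_islands)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_islands()` returns the best fitness found on each island. Because no island ever blocks on another, the result can vary slightly between runs; that nondeterminism is the price of dropping synchronization. In CPython the threads do not run CPU-bound work truly in parallel, so the sketch shows the communication structure only, not the speedup.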
16.
In this paper, we present an efficient parallel algorithm to solve Toeplitz–block and block–Toeplitz systems on distributed memory multicomputers. This algorithm parallelizes the Generalized Schur Algorithm to obtain the semi-normal equations. Our parallel implementation reduces the communication cost and optimizes the memory access. The experimental analysis on a cluster of personal computers shows the scalability of the implementation. The algorithm is portable because it is based on standard tools and libraries, such as ScaLAPACK and MPI.
17.
This paper develops algorithms for filtering and smoothing for parallel computers. Numerical results are presented and implementation details are discussed. An example illustrates that parallel methods have better convergence properties than nonparallel methods for nonlinear problems.
18.
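Implementation of Parallel Reservoir Simulation Software and Its Application on Domestic High-Performance Computers  Total citations: 5, self-citations: 0, citations by others: 5
This paper describes the application of fine-scale numerical reservoir simulation with millions of grid points on domestically built high-performance parallel computers and PC cluster systems. For several sets of real million-grid-point data from domestic oil fields, it reports the results obtained in several domestic parallel computing environments and analyzes and evaluates them. On this basis, it discusses the key techniques encountered in implementing parallel reservoir simulation software efficiently, and examines the bottlenecks that commonly arise when parallelizing large software packages, together with ways to address them.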
19.
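This paper presents the design and implementation of a speech generation tool based on parallel processing. The tool can support multimedia technology, various kinds of audio-enabled software, and the operation of speech libraries. It runs on IBM PC series microcomputers and compatibles.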