Similar literature
1.
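Parallel algorithm for time-dependent Monte Carlo transport problems   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper presents a parallel algorithm for time-dependent Monte Carlo (MC) transport problems, and discusses and optimizes the loading and execution modes of the parallel program. A parallel random number generator algorithm that requires no communication in the ideal case is designed for parallel MC computation. Dynamic MC transport problems involve a large number of I/O operations; in particular, reading the remaining-particle data files takes considerable I/O time. To address the I/O problem, three parallel I/O algorithms are proposed. Finally, performance test results for the parallel algorithm are given: compared with the serial computation time, the parallel computation time on 64 processors is shorter by a factor of 30.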

2.
This paper proposes two viable computing strategies for distributed parallel systems: domain division with sub-domain overlapping, and asynchronous communication. We have implemented a parallel computing procedure for simulating the growth of a Ti thin film in a system of 1000 x 1000 atoms by means of the Monte Carlo (MC) method. This approach greatly reduces the computation time for simulation of large-scale thin film growth under realistic deposition rates. The multi-lattice MC model of deposition comprises two basic events: deposition and surface diffusion. Since diffusion constitutes more than 90% of the total simulation time of the whole deposition process at high temperature, we concentrated on implementing a new parallel diffusion simulation that reduces communication time during simulation. Asynchronous communication and domain overlapping techniques are used to reduce the waiting time and communication time among parallel processors. The parallel algorithms we propose can simulate the thin

3.
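A parallelization method based on decomposition of the computational domain is proposed for building the stiffness matrix and right-hand-side vector in isogeometric analysis. The computational domain is decomposed into several disjoint subregions, and each subregion is assigned one processor. All processors carry out the computation on their subregions in parallel; once every processor has finished, a fast merge algorithm completes the assembly of the linear system. Experiments show that the proposed method achieves a speedup of 6.46 on an 8-core machine and can compute 6.8 million matrix entries in about 4 seconds. Using the Intel MKL sparse solver to solve the linear system, the isogeometric analysis solver can solve for 520,000 degrees of freedom in about 10 seconds, and the method is more than ten thousand times faster than ISOGAT.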

4.
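In the field of bioluminescence imaging, the Monte Carlo based simulation of forward photon transport is parallelized, which increases the simulation speed. The paper first introduces the set of parallel mechanisms and serial acceleration algorithms that were adopted, then validates the parallel simulation results for both correctness and performance and compares them with the results of the software packages MOSE and triMC3D, and finally summarizes the parallel platform and outlines future work.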

5.
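Effect of domain decomposition on the parallel computing speed of a meteorological model   Cited by: 2 (self-citations: 1, citations by others: 1)
Numerical experiments are used to analyze how the domain decomposition strategy affects the parallel computing speed of the ARPS meteorological model. Whether or not compiler optimization is used, the speedup and parallel efficiency are greatest when the decomposed data subdomains are approximately square. With level-2 compiler optimization, the parallel speed also depends on the decomposition direction: decomposing along the y direction improves parallel efficiency more than decomposing along the x direction. Without optimization, the parallel speed is almost independent of the decomposition direction. The experimental results are discussed and analyzed in terms of communication volume and compiler optimization.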
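
The communication-volume side of this result can be illustrated with a small sketch. It is not taken from the paper: the grid size, the processor count, the one-cell-wide halo, and the helper names `halo_cells` and `decompositions` are all assumptions made for illustration. The sketch counts the boundary cells that the busiest subdomain exchanges with its neighbours for each way of splitting the grid.

```python
def halo_cells(nx, ny, px, py):
    """Boundary cells exchanged by the busiest block when an nx-by-ny grid
    is split into px-by-py blocks (assumed one-cell-wide halo)."""
    sub_x, sub_y = nx // px, ny // py
    # A block has at most two neighbours in each direction; a direction
    # that is not split has none.
    x_neighbours = min(px - 1, 2)
    y_neighbours = min(py - 1, 2)
    # Each x-neighbour shares a face of length sub_y, each y-neighbour
    # shares a face of length sub_x.
    return x_neighbours * sub_y + y_neighbours * sub_x


def decompositions(p):
    """All (px, py) pairs with px * py == p."""
    return [(px, p // px) for px in range(1, p + 1) if p % px == 0]


if __name__ == "__main__":
    nx = ny = 512  # hypothetical grid size
    p = 16         # hypothetical processor count
    for px, py in decompositions(p):
        print(f"{px:2d} x {py:2d}: {halo_cells(nx, ny, px, py)} halo cells")
```

Under these assumptions the near-square split exchanges the fewest cells, which is consistent with the abstract's finding. The sketch does not model the dependence on decomposition direction that the abstract reports under compiler optimization.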

6.
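On a cluster system with a distributed-memory architecture, a parallel Monte Carlo algorithm is designed and implemented using the portable Message Passing Interface (MPI) with its C language binding, which effectively addresses the heavy computational load and the long execution time of the serial algorithm. Based on an analysis of the communication time overhead between cluster nodes, the parallel Monte Carlo algorithm is improved with a master-slave programming model; this achieves load balancing, raises the utilization of the cluster processors, and further shortens the execution time.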

7.
《Parallel Computing》2014,40(10):646-660
Monte Carlo (MC) neutral particle transport codes are considered the gold standard for nuclear simulations, but they cannot be robustly applied to high-fidelity nuclear reactor analysis without accommodating several terabytes of materials and tally data. While this is not a large amount of aggregate data for a typical high performance computer, MC methods are only embarrassingly parallel when the key data structures are replicated for each processing element, an approach which is likely infeasible on future machines. The present work explores the use of spatial domain decomposition to make full-scale nuclear reactor simulations tractable with Monte Carlo methods, presenting a simple implementation in a production-scale code. Good performance is achieved for mesh-tallies of up to 2.39 TB distributed across 512 compute nodes while running a full-core reactor benchmark on the Mira Blue Gene/Q supercomputer at the Argonne National Laboratory. In addition, the effects of load imbalances are explored with an updated performance model that is empirically validated against observed timing results. Several load balancing techniques are also implemented to demonstrate that imbalances can be largely mitigated, including a new and efficient way to distribute extra compute resources across finer domain meshes.

8.
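Parallel implementation of stage-wise domain decomposition for three-dimensional variational assimilation of meteorological data   Cited by: 2 (self-citations: 0, citations by others: 2)
Because it markedly improves assimilation quality, variational assimilation is becoming the mainstream assimilation method in numerical weather prediction. This work studies parallel computation for three-dimensional variational assimilation and proposes a stage-wise domain decomposition for it, an adaptive partitioning algorithm for observation data, matrix transposition and halo-region communication that overlap computation with communication, and a file I/O method. On this basis, an MPI-parallel three-dimensional variational prototype system is implemented; it achieves a parallel speedup of 11.9 on a Linux cluster made up of 8 dual-CPU nodes.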

9.
We have developed a high performance version of the Monte Carlo particle transport simulation code MC4. The original application code, developed in Visual Basic for Applications (VBA) for Microsoft Excel, was first rewritten in the C programming language to improve code portability. Several pseudo-random number generators have also been integrated and studied. The new MC4 version was then parallelized for shared and distributed-memory multiprocessor systems using the Message Passing Interface. Two parallel pseudo-random number generator libraries (SPRNG and DCMT) have been seamlessly integrated. The performance speedup of parallel MC4 has been studied on a variety of parallel computing architectures including an Intel Xeon server with 4 dual-core processors, a Sun cluster consisting of 16 nodes of 2 dual-core AMD Opteron processors and a 200 dual-processor HP cluster. For large problem sizes, which are limited only by the physical memory of the multiprocessor server, the speedup results are almost linear on all systems. We have validated the parallel implementation against the serial VBA and C implementations using the same random number generator. Our experimental results on the transport and energy loss of electrons in a water medium show that the serial and parallel codes are equivalent in accuracy. The present improvements allow higher particle energies to be studied with more accurate physical models, and improve statistics because more particle tracks can be simulated with low response time.

10.
The results of current research in the development of a Cray algorithm for time-dependent Monte Carlo photon radiation transport are presented. The method that has been developed is a fully vectorized particle-vector scheme. This technique tracks groups of particles simultaneously using a vector-stack formalism based upon particle events. Timing comparisons between this algorithm and the traditional single-particle approach are presented.

11.
In many situations it is important to be able to propose N independent realizations of a given distribution law. We propose a strategy for making N parallel Monte Carlo Markov chains (MCMC) interact in order to get an approximation of an independent N-sample of a given target law. In this method each individual chain proposes candidates for all other chains. We prove that the set of interacting chains is itself an MCMC method for the product of N target measures. Compared to independent parallel chains this method is more time consuming, but we show through examples that it possesses many advantages. This approach is applied to a biomass evolution model.

12.
The EGS4 code, developed at Stanford Linear Accelerator Center, simulates electron-photon cascading phenomena. The original code is inherently sequential: processing one particle at a time. This paper reports on a series of experiments in parallelizing different versions of EGS4. Our parallel experiments were run on a 30-processor Sequent Balance B21 and a 6-processor Symmetry S27. We have considered the following approaches for parallel execution of this application code:
1. Original sequential version modified for parallel processing: 1 processor;
2. Version 1 run multiprocessed: 1 to 29 processors;
3. Sequential version modified for large-grain parallel processing: 1 processor;
4. Version 3 run using the Sequent Microtasking Library: 1 to 29 processors.

For each approach, we discuss the relative advantages and disadvantages in the areas of coding effort, understandability and portability, as well as performance, and outline a new parallelization approach we are currently pursuing based on Large-Grain Data Flow techniques.


13.
14.
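Monte Carlo tree search is a commonly used reinforcement learning algorithm, and the exponential growth of the dynamic state space during a game is a factor that limits its learning efficiency. The Monte Carlo tree search algorithm is optimized with a parallel approach, and a parallel Monte Carlo tree search algorithm based on passing win-rate estimates is proposed. The improved parallel game search framework contains one master process and multiple child processes: the child processes explore, and the master process makes decisions based on the win-rate estimates passed to it by the child processes. Experiments on the multi-agent game platform Pommerman show that, compared with traditional Monte Carlo tree search, the parallel Monte Carlo tree search algorithm effectively improves resource utilization, win rate, and decision efficiency.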

15.
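Parallelization of the Monte Carlo neutron-photon transport code MCNP   Cited by: 2 (self-citations: 0, citations by others: 2)
1. Introduction. With the advent of parallel computers, parallel algorithms and parallel systems have continued to develop, for example PVM (Parallel Virtual Machine), SMP (Shared Memory Processors), MPI (Message Passing Interface), and HPF (High Performance Fortran). These parallel systems work on essentially the same principles and differ mainly in their parallel directives and data-transfer mechanisms. Among them, PVM and MPI have the advantages of being general-purpose, small in system size, easy to use, and highly portable, and they are relatively easy to install, test, program, and implement. It is currently internationally recognized…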

16.
We present the concept of a pseudo-random tree, and generalize the Lehmer pseudo-random number generator as an efficient implementation of the concept. Pseudo-random trees can be used to give reproducibility, as well as speed, in Monte Carlo computations on parallel computers with either the SIMD architecture of the current generation of supercomputer or the MIMD architecture characteristic of the next generation. Monte Carlo simulations based on pseudo-random trees are free of certain pitfalls, even for sequential computers, which can make them considerably more useful.
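
One way to picture a pseudo-random tree is sketched below. This is an illustrative reading of the abstract, not the authors' implementation: each state has two linear congruential successors, where the "left" successor is assumed to spawn the seed of a new task and the "right" successor is assumed to produce the next number inside a task. The modulus, multipliers, increments, and function names are hypothetical and have not been vetted for statistical quality.

```python
M = 2**32                                # hypothetical modulus
A_LEFT, C_LEFT = 22695477, 1             # hypothetical "spawn" generator
A_RIGHT, C_RIGHT = 1664525, 1013904223   # hypothetical "next number" generator


def left(state):
    """Left successor: seed of a newly spawned stream (assumed role)."""
    return (A_LEFT * state + C_LEFT) % M


def right(state):
    """Right successor: next state within the current stream (assumed role)."""
    return (A_RIGHT * state + C_RIGHT) % M


def task_seed(root, task_id):
    """Seed of task task_id: follow task_id + 1 left branches from the root."""
    state = root
    for _ in range(task_id + 1):
        state = left(state)
    return state


def task_samples(root, task_id, n):
    """n uniform samples in [0, 1) for one task, drawn along right branches."""
    state = task_seed(root, task_id)
    samples = []
    for _ in range(n):
        state = right(state)
        samples.append(state / M)
    return samples


def task_mean(root, task_id, n=1000):
    """A stand-in Monte Carlo tally: the mean of the task's samples."""
    samples = task_samples(root, task_id, n)
    return sum(samples) / len(samples)


if __name__ == "__main__":
    root = 12345  # hypothetical root seed
    forward = [task_mean(root, t) for t in range(4)]
    backward = [task_mean(root, t) for t in reversed(range(4))][::-1]
    # Each task's numbers depend only on its position in the tree,
    # not on the order in which tasks are executed.
    print(forward == backward)
```

Because a task's stream is fixed by its path from the root, the tallies do not depend on how tasks are scheduled across processors, which is the reproducibility property the abstract refers to.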

17.
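This paper introduces the wide application of digital communication systems and the basic idea of the Monte Carlo algorithm. It focuses on analyzing the error probability in digital communication systems and on using Monte Carlo simulation to evaluate the performance of digital communication systems in the presence of noise and interference.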
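
A minimal sketch of this kind of Monte Carlo error-probability estimate follows. The abstract does not name a modulation scheme or channel, so BPSK signalling over an additive white Gaussian noise channel is assumed here, interference is not modelled, and the function names are hypothetical. The simulated bit error rate is compared with the standard closed-form BPSK result, 0.5 * erfc(sqrt(Eb/N0)).

```python
import math
import random


def simulate_ber(ebn0_db, n_bits, seed=0):
    """Monte Carlo estimate of the bit error rate for BPSK over AWGN
    (assumed system model, unit energy per bit)."""
    rng = random.Random(seed)
    ebn0 = 10 ** (ebn0_db / 10)
    sigma = math.sqrt(1 / (2 * ebn0))  # noise standard deviation, N0 / 2 variance
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        symbol = 1.0 if bit else -1.0
        received = symbol + rng.gauss(0.0, sigma)
        decided = 1 if received > 0 else 0
        errors += decided != bit
    return errors / n_bits


def theoretical_ber(ebn0_db):
    """Closed-form BPSK bit error rate over AWGN."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0))


if __name__ == "__main__":
    for ebn0_db in (0.0, 2.0, 4.0):
        estimate = simulate_ber(ebn0_db, 100_000)
        print(f"{ebn0_db:.0f} dB: simulated {estimate:.4f}, "
              f"theory {theoretical_ber(ebn0_db):.4f}")
```

The number of simulated bits controls the statistical error of the estimate; low error rates need proportionally more trials to observe enough error events.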

18.
19.
Nowadays, Monte Carlo techniques are very common for the development of Nuclear Medicine systems. Simulations can be very helpful for the optimization of SPECT and PET cameras, and for investigating the importance of several physical effects involved in image formation. In this paper, a simulation study for evaluating various aspects influencing image formation in detectors for Nuclear Medicine is presented. To this end, the EGSnrc Monte Carlo code has been used, which transports photons and electrons in any material and handles various physical phenomena. Here, some detector systems are simulated, consisting of a parallel-hole collimator and a pixellated scintillator. Various effects are investigated, such as electron transport, fluorescence photons, and collimator septa penetration. Results are evaluated by means of energy spectra, photon fluxes, uniformity of response, SNR and spatial resolution.

20.
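The sequential Monte Carlo method can solve many practical problems. Its system model is more widely applicable than that of the Kalman filter algorithm, so studying the Monte Carlo method is of practical significance. This paper analyzes the performance of sequential Monte Carlo algorithms and carries out simulation experiments on the tracking capability of the method. The simulated system model is a nonlinear one. The simulation experiments compare the performance of the EKF, SIS, and SIR algorithms. Meaningful conclusions are drawn by analyzing and comparing the simulation results of the different algorithms, which is of value for solving some engineering problems.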
