Similar literature
Found 20 similar articles; search time: 31 ms
1.
We provide a theoretical analysis of the communication requirements of parallel algorithms for molecular dynamics simulations. We describe two commonly used algorithms, space decomposition and force decomposition, and analyze their communication requirements; each is better in a distinct computation regime. We next introduce a new hybrid algorithm that further reduces communication. We show that the new algorithm is optimal by providing a matching lower bound.

2.
Most simulations of colloidal suspensions treat the solvent implicitly or as a continuum. However, as particle size decreases to the nanometer scale, this approximation fails and one needs to treat the solvent explicitly. Due to the large number of smaller solvent particles, such simulations are computationally challenging. Additionally, as the ratio of nanoparticle size to solvent size increases, commonly used molecular dynamics algorithms for neighbor finding and parallel communication become inefficient. Here we present modified algorithms that enable fast single-processor performance and reasonable parallel scalability for mixtures with a wide range of particle size ratios. The methods developed are applicable to any system with widely varying force distance cutoffs, independent of particle sizes and of the interaction potential. As a demonstration of the new algorithms' effectiveness, we present results for the pair correlation function and diffusion constant for mixtures where colloidal particles interact via integrated potentials. In these systems, with nanoparticles 20 times larger than the surrounding solvent particles, our parallel molecular dynamics code runs more than 100 times faster using the new algorithms.
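
For context, the conventional neighbor-finding approach that this abstract takes as its starting point can be sketched as a single-cutoff cell list. This is a minimal illustration only, not the modified algorithm of the paper: the function names, the non-periodic box, and the single global cutoff are all assumptions made here. With one cutoff sized for the largest particles, every small solvent particle is binned into oversized cells, which is the inefficiency the abstract refers to.

```python
import itertools
import random
from collections import defaultdict


def random_points(n, box, seed=0):
    """Hypothetical helper: n uniformly random points in a cubic box."""
    rng = random.Random(seed)
    return [tuple(rng.uniform(0.0, box) for _ in range(3)) for _ in range(n)]


def brute_force_pairs(points, cutoff):
    """Reference O(N^2) search: all index pairs (i < j) closer than cutoff."""
    c2 = cutoff * cutoff
    pairs = set()
    for i, j in itertools.combinations(range(len(points)), 2):
        d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
        if d2 < c2:
            pairs.add((i, j))
    return pairs


def cell_list_pairs(points, cutoff):
    """Bin particles into cubic cells of edge `cutoff`, then compare only
    particles in the same or adjacent cells (non-periodic box assumed)."""
    c2 = cutoff * cutoff
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(int(x // cutoff) for x in p)].append(idx)

    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            # .get() avoids creating empty cells while iterating
            neighbours = cells.get((cx + dx, cy + dy, cz + dz))
            if not neighbours:
                continue
            for i in members:
                for j in neighbours:
                    if i < j:
                        d2 = sum((a - b) ** 2
                                 for a, b in zip(points[i], points[j]))
                        if d2 < c2:
                            pairs.add((i, j))
    return pairs
```

Both functions return the same set of pairs; the cell list only reduces how many candidate pairs are distance-tested.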

3.
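To reduce the time spent on short-range force calculations in molecular dynamics simulations, we design and implement a matching unit for molecular dynamics simulation based on a field-programmable gate array (FPGA). On the theoretical side, we analyze the physical laws governing interparticle forces in molecular dynamics simulations and propose two methods for selecting the particle pairs that meet the requirements of short-range force calculation: the partial-order method and the plane method. On the implementation side, we build the matching unit on a Xilinx Virtex UltraScale+ HBM VCU128 FPGA board using the emerging hardware description language SpinalHDL. Finally, we compare the hardware test results with the theoretical results and verify that the matching unit effectively filters out particle pairs that do not contribute to the short-range force calculation. We also compare resource consumption with the partial-order and plane methods against that with direct calculation, and show that the partial-order and plane methods save 70% of the system's DSP resources.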

4.
In molecular dynamics (MD) simulations, pairwise additive calculations of potentials and of their derivatives with respect to the coordinates, i.e., forces, such as the Lennard–Jones interactions and the short-range part of the Coulombic interactions, form the main part of the arithmetic operations. It is essential to achieve high thread-level parallelization efficiency in these pairwise additive calculations of potentials and forces in order to use current supercomputers with many-core architectures effectively. In this paper, we propose four new thread-level parallelization algorithms for the pairwise additive potential and force calculations. We implement the four codes in an MD calculation code based on the fast multipole method. Performance benchmarks were taken on the FX100 supercomputer and the Intel Xeon Phi coprocessor. The code succeeds in achieving high thread-level parallelization efficiency with 32 threads on the FX100 and up to 60 threads on the Xeon Phi.
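
To make the "pairwise additive" calculation concrete, here is a minimal serial sketch of a Lennard-Jones energy and force loop. It is an illustration under assumptions (reduced units, a plain cutoff, hypothetical function names) and does not reproduce any of the four thread-level algorithms the abstract proposes. The update of both `forces[i]` and `forces[j]` in the inner loop is the write pattern that makes thread-level parallelization of such loops non-trivial.

```python
def lj_pair(r2, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy and (force magnitude / r) for squared distance r2."""
    sr2 = sigma * sigma / r2
    sr6 = sr2 ** 3
    sr12 = sr6 * sr6
    energy = 4.0 * epsilon * (sr12 - sr6)
    # F(r) = -dV/dr = 24*eps*(2*(sigma/r)^12 - (sigma/r)^6) / r
    f_over_r = 24.0 * epsilon * (2.0 * sr12 - sr6) / r2
    return energy, f_over_r


def lj_forces(positions, cutoff=2.5):
    """Sum pair energies and forces over all i < j within the cutoff."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    potential = 0.0
    c2 = cutoff * cutoff
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2]
            if r2 >= c2:
                continue
            energy, f_over_r = lj_pair(r2)
            potential += energy
            for k in range(3):
                forces[i][k] += f_over_r * d[k]
                forces[j][k] -= f_over_r * d[k]  # Newton's third law
    return potential, forces
```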

5.
We describe the algorithms for NVT and NPT-ensemble simulations developed within the parallel molecular dynamics program GBMOLDD. This program uses the domain decomposition algorithm and is targeted at large-scale simulations of molecular systems (particularly polymers and liquid crystals) composed of both spherically symmetric and nonspherical sites. The nonspherical sites can be described either by a Gay-Berne potential or by soft repulsive spherocylinders. The molecules can be of arbitrary topology and the intramolecular forces are described via standard force fields. We tested the stability of both leap-frog and velocity-Verlet integrators on two “real-life” systems: a nematic liquid crystal phase of 1944 one-site Gay-Berne molecules and a system of 512 flexible liquid-crystalline dimers. In both cases the algorithm demonstrates good stability over the typical simulation times required for new phase formation and/or molecular relaxation processes.
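
As a reminder of what a velocity-Verlet stability check involves, the following is a minimal sketch in the constant-energy (NVE) setting for a single degree of freedom. It is not GBMOLDD code and does not include the thermostat or barostat needed for the NVT and NPT ensembles; the function name and the harmonic test force are assumptions chosen for illustration.

```python
def velocity_verlet(x, v, accel, dt, n_steps):
    """Integrate dx/dt = v, dv/dt = accel(x) with the velocity-Verlet scheme."""
    a = accel(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * a   # half kick with the old acceleration
        x = x + dt * v_half         # full drift
        a = accel(x)                # acceleration at the new position
        v = v_half + 0.5 * dt * a   # half kick with the new acceleration
    return x, v


def harmonic_energy(x, v):
    """Total energy of a unit-mass, unit-frequency harmonic oscillator."""
    return 0.5 * v * v + 0.5 * x * x
```

For example, `velocity_verlet(1.0, 0.0, lambda q: -q, 0.01, 10000)` keeps `harmonic_energy` close to its initial value of 0.5 over the whole run, which is the kind of bounded energy error a stability test looks for.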

6.
An approach is proposed to improve the efficiency of fourth-order algorithms for numerical integration of the equations of motion in molecular dynamics simulations. The approach is based on an extension of the decomposition scheme by introducing extra evolution subpropagators. The extended set of parameters of the integration is then determined by reducing the norm of truncation terms to a minimum. In such a way, we derive new explicit symplectic Forest-Ruth- and Suzuki-like integrators and present them in time-reversible velocity and position forms. It is proven that these optimized integrators lead to the best accuracy in the calculations at the same computational cost among all possible algorithms of the fourth order from a given decomposition class. It is shown also that the Forest-Ruth-like algorithms, which are based on direct decomposition of exponential propagators, provide better optimization than their Suzuki-like counterparts which represent compositions of second-order schemes. In particular, using our optimized Forest-Ruth-like algorithms allows us to increase the efficiency of the computations by more than ten times with respect to that of the original integrator by Forest and Ruth, and by approximately five times with respect to Suzuki's approach. The theoretical predictions are confirmed in molecular dynamics simulations of a Lennard-Jones fluid. A special case of the optimization of the proposed Forest-Ruth-like algorithms to celestial mechanics simulations is considered as well.
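
The baseline that this abstract compares against, the original fourth-order Forest-Ruth integrator, can be sketched as follows in position form. This is the classical scheme with the well-known coefficient θ = 1/(2 − 2^(1/3)); the optimized integrators of the paper use additional subpropagators whose parameters are not given in the abstract, so they are not reproduced here. The function names and the harmonic-oscillator error check are assumptions for illustration.

```python
import math

# Classical Forest-Ruth coefficient
THETA = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))


def forest_ruth_step(x, v, accel, h):
    """One step of the original Forest-Ruth scheme (drift-kick composition)."""
    x += 0.5 * THETA * h * v
    v += THETA * h * accel(x)
    x += 0.5 * (1.0 - THETA) * h * v
    v += (1.0 - 2.0 * THETA) * h * accel(x)
    x += 0.5 * (1.0 - THETA) * h * v
    v += THETA * h * accel(x)
    x += 0.5 * THETA * h * v
    return x, v


def integrate(x, v, accel, h, n_steps):
    """Apply n_steps Forest-Ruth steps of size h."""
    for _ in range(n_steps):
        x, v = forest_ruth_step(x, v, accel, h)
    return x, v


def oscillator_error(n_steps, t_end=1.0):
    """Phase-space error against the exact solution x = cos(t), v = -sin(t)."""
    x, v = integrate(1.0, 0.0, lambda q: -q, t_end / n_steps, n_steps)
    return math.hypot(x - math.cos(t_end), v + math.sin(t_end))
```

Halving the step size should reduce `oscillator_error` by roughly a factor of 2^4 = 16, which is the signature of a fourth-order method.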

7.
The use of accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, machines with more than one type of floating-point processor, are now becoming more prevalent due to these advantages. In this work, we discuss several important issues in porting a large molecular dynamics code for use on parallel hybrid machines: (1) choosing a hybrid parallel decomposition that works on central processing units (CPUs) with distributed memory and accelerator cores with shared memory, (2) minimizing the amount of code that must be ported for efficient acceleration, (3) utilizing the available processing power from both multi-core CPUs and accelerators, and (4) choosing a programming model for acceleration. We present our solution to each of these issues for short-range force calculation in the molecular dynamics package LAMMPS; however, the methods can be applied in many molecular dynamics codes. Specifically, we describe algorithms for efficient short-range force calculation on hybrid high-performance machines. We describe an approach for dynamic load balancing of work between CPU and accelerator cores. We describe the Geryon library, which allows a single code to compile with both CUDA and OpenCL for use on a variety of accelerators. Finally, we present results on a parallel test cluster containing 32 Fermi GPUs and 180 CPU cores.

8.
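Molecular dynamics is an important computational tool that is widely used in many fields. Because its computational cost is large, parallel computing methods are increasingly being introduced into molecular dynamics simulations. In this paper, working on the currently common SMP cluster systems and taking their architecture into account, we implement two-level parallel computation of short-range molecular dynamics for three parallel algorithms (domain decomposition, atom decomposition, and force decomposition), using a hybrid MPI+Pthread programming model with message passing between nodes and shared-memory programming within each node. The results show that the different algorithms, when run in two-level parallel mode, differ in computational efficiency relative to the original message-passing-only mode. Overall, however, the two-level approach can exploit more computing resources and thereby helps increase computing capability.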

9.
Computers & Chemistry, 1990, 14(3): 219-224
A method for parallelizing the non-bonded pair list generation and non-bonded force calculation algorithm for molecular dynamics is presented. Using the parallelism inherent to existing algorithms, it is possible, with minor modifications, to adapt the non-bonded routines to multiple-instruction, multiple-data (MIMD) computer architectures. This methodology has been applied to the molecular dynamics program GROMOS for the Stellar GS1000 Graphics Supercomputer. Aspects of the Stellar GS1000 architecture and programming environment are presented with attention to the performance of the molecular dynamics program. A speed enhancement factor of about 3 has been obtained relative to the serial execution of the program, which is close to the theoretical maximum factor of 4 for this machine. The overall speed enhancement factor increases to about 6 with the additional use of vectorization for a version that has been extensively rewritten to be more highly vectorizable than the standard code, where gains from vectorization are slight. In the former case, the program executes at about 35% of the speed obtained on a single-processor Cray X-MP.

10.
In this work, we describe the development of softcore potential functions that permit occasional "tunneling" through regions of conformational space that would otherwise be sterically prohibited during molecular dynamics (MD) simulations. The modification consists of a truncation of the nonbonded interaction before the steeply repulsive region encountered at short interatomic distances. This modification affects both the Lennard-Jones and Coulomb parts of the nonbonded potential. Critical to success is the choice of appropriate pairwise switching distances at which this modification should be made. In the present work, these are calculated based on potential of mean force functions extracted from model system molecular dynamics simulations. We believe that these functions describe the dynamic short-range interactions much better than mean force potentials derived from an ensemble of static structures (e.g. the Protein Data Bank (PDB)). Once a set of mean force potentials is obtained, a single empirical parameter, the effective barrier height, is employed to determine switching distances for all pairwise atomic interactions. Changing this single parameter allows adjustment of the "softness" of the whole system. We tested the applicability of the new softcore potentials in a loop structure optimization study. The H1 loop in the antibody 17/9 was selected as our test case because substantial repacking of loop residues in the dense protein environment is necessary for successful relaxation of random initial conformations. Softcore simulations converged to correct loop conformations, in contrast to standard simulations, which never sampled this structure even after 10 ns. The resulting root mean square deviation (RMSD) values (below 1.3 Å for all heavy atoms of the loop) demonstrate the usefulness of the approach based on mean force derived softcore functions.
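
The general idea of truncating a pair potential before its steeply repulsive region can be illustrated with a deliberately simplified sketch. Everything specific here is an assumption: the abstract does not give the functional form, and the paper derives switching distances from potentials of mean force rather than from the bare Lennard-Jones curve used below. In this sketch the energy is simply held constant below a switching distance, and the switching distance is taken as the point on the repulsive branch where the Lennard-Jones energy equals a chosen barrier height.

```python
import math


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones pair energy."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def switching_distance(barrier, epsilon=1.0, sigma=1.0):
    """Hypothetical mapping: distance on the repulsive branch where
    V_LJ(r) == barrier (requires barrier >= 0).

    With s = (sigma/r)^6, 4*eps*(s^2 - s) = barrier gives
    s = (1 + sqrt(1 + barrier/eps)) / 2.
    """
    s = 0.5 * (1.0 + math.sqrt(1.0 + barrier / epsilon))
    return sigma * s ** (-1.0 / 6.0)


def softcore_lj(r, r_switch, epsilon=1.0, sigma=1.0):
    """Illustrative softcore: hold the energy at V(r_switch) for r < r_switch,
    so overlapping atoms see a finite barrier instead of a diverging wall."""
    return lennard_jones(max(r, r_switch), epsilon, sigma)
```

Raising or lowering the single `barrier` argument moves every switching distance at once, which mirrors the role the abstract assigns to the effective barrier height.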

11.
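Using GPUs to accelerate non-bonded force calculation in molecular dynamics simulations   Total citations: 1, self-citations: 0, citations by others: 1
In molecular dynamics (MD) simulations, computing the non-bonded forces takes a large amount of time. This paper proposes two double-precision algorithms, based on CUDA and Brook+, that implement the non-bonded force calculation on two mainstream GPUs from NVIDIA and AMD respectively, using the computing power of the GPU to accelerate the whole MD program. The algorithms partition the MD task and use domain decomposition to map the non-bonded force calculation onto the GPU compute cores. Two optimizations tailored to the characteristics of the two GPUs are also proposed: shared memory within a thread block, and dataset minimization. Performance tests show that, compared with a single core of a 2.6 GHz Intel Xeon CPU, a high-speed particle collision simulation with 432,000 particles runs 6.5 times faster on a system with an NVIDIA Tesla C1060 and 4.8 times faster on a system with an AMD HD4870.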

12.
In this work, different global optimization techniques are assessed for the automated development of molecular force fields, as used in molecular dynamics and Monte Carlo simulations. The quest of finding suitable force field parameters is treated as a mathematical minimization problem. Intricate problem characteristics such as extremely costly and even abortive simulations, noisy simulation results, and especially multiple local minima naturally lead to the use of sophisticated global optimization algorithms. Five diverse algorithms (pure random search, recursive random search, CMA-ES, differential evolution, and taboo search) are compared to our own tailor-made solution named CoSMoS. CoSMoS is an automated workflow. It models the parameters’ influence on the simulation observables to detect a globally optimal set of parameters. It is shown how and why this approach is superior to other algorithms. Applied to suitable test functions and simulations for phosgene, CoSMoS effectively reduces the number of required simulations and real time for the optimization task.

13.
Using the recently developed smart wall molecular dynamics algorithm, shear-driven gas flows in nano-scale channels are investigated to reveal the surface–gas interaction effects for flows in the transition and free molecular flow regimes. For the specified surface properties and gas–surface pair interactions, density and stress profiles exhibit a universal behavior inside the wall force penetration region at different flow conditions. Shear stress results are utilized to calculate the tangential momentum accommodation coefficient (TMAC) between argon gas and FCC walls. The TMAC value is shown to be independent of the flow properties and Knudsen number in all simulations. Velocity profiles show distinct deviations from the kinetic theory based solutions inside the wall force penetration depth, while they match the linearized Boltzmann equation solution outside these zones. Results indicate emergence of the wall force field penetration depth as an additional length scale for gas flows in nano-channels, breaking the dynamic similarity between rarefied and nano-scale gas flows solely based on the Knudsen and Mach numbers.

14.
To achieve scalable parallel performance in molecular dynamics simulations, we have modeled and implemented several dynamic spatial domain decomposition algorithms. The modeling is based upon the bulk synchronous parallel architecture model (BSP), which describes supersteps of computation, communication, and synchronization. Using this model, we have developed prototypes that explore the differing costs of several spatial decomposition algorithms and then use this data to drive implementation of our molecular dynamics simulator, Sigma. The parallel implementation is not bound to the limitations of the BSP model, allowing us to extend the spatial decomposition algorithm. For an initial decomposition, we use one of the successful decomposition strategies from the BSP study and then subsequently use performance data to adjust the decomposition, dynamically improving the load balance. The motivating reason to use historical performance data is that the computation to predict a better decomposition increases in cost with the quality of prediction, while the measurement of past work often has hardware support, requiring only a slight amount of work to modify the decomposition for future simulation steps. In this paper, we present our adaptive spatial decomposition algorithms, the results of modeling them with the BSP, the enhanced spatial decomposition algorithm, and its performance results on computers available locally and at the national supercomputer centers.

15.
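LAMMPS is open-source software for molecular dynamics simulation and related problems; it is widely used, for example to study the properties of solids and liquids. It supports GPU acceleration through CUDA and OpenCL. Because OpenCL is cross-platform, it is the focus of this study. We summarize design principles to observe in OpenCL kernel programming and describe an improved form of Amdahl's law for estimating the theoretical speedup of heterogeneous platforms. We measure the performance of the LAMMPS short-range force calculation on the Y485P platform. By optimizing the kernel code of key parts of the short-range force calculation, such as neighbor-list construction and the force computation itself, we obtain better acceleration.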
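
For reference, the classical form of Amdahl's law, which the abstract above takes as its starting point, can be written as a short function. This sketch shows only the textbook law; the improved variant for heterogeneous platforms is not specified in the abstract and is not reproduced here. The function and parameter names are assumptions.

```python
def amdahl_speedup(accelerated_fraction, accel_factor):
    """Classical Amdahl's law.

    accelerated_fraction: share of the original run time that benefits
        from acceleration (between 0 and 1).
    accel_factor: how many times faster that share runs (e.g. on a GPU).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if accel_factor <= 0.0:
        raise ValueError("accel_factor must be positive")
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / accel_factor)
```

If, say, 90% of the run time were spent in an accelerated force kernel, the overall speedup could not exceed 10 no matter how fast the kernel became, because the remaining 10% still runs at the original speed.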

16.
Simulations of interacting particles are common in science and engineering, appearing in such diverse disciplines as astrophysics, fluid dynamics, molecular physics, and materials science. These simulations are often computationally intensive and so are natural candidates for massively parallel computing. Many-body simulations that directly compute interactions between pairs of particles, be they short-range or long-range interactions, have been parallelized in several standard ways. The simplest approaches require all-to-all communication, an expensive communication step. The fastest methods assign a group of nearby particles to a processor, which can lead to load imbalance and be difficult to implement efficiently. We present a new approach, suitable for direct simulations, that avoids all-to-all communication without requiring any geometric clustering. We demonstrate its utility in several parallel molecular dynamics simulations and compare performance against other parallel approaches. The new algorithm proves to be fastest for simulations of up to several thousand particles.

17.
The hydration of carbohydrates plays a key role in many biological processes. Molecular dynamics simulations provide an effective tool for investigating the hydration of complex solutes such as carbohydrates. In this article we devise an algorithm for the calculation of two-dimensional radial pair distributions describing the probability of finding a water molecule in a site defined by two reference atoms. The normalized 2D radial pair distribution is proposed as an effective tool for investigating and comparing localized or ordered water sites around flexible molecules such as carbohydrates when analyzing molecular dynamics simulations, and the utility of 2D radial pair distributions is demonstrated using sucrose as an example. In this relatively simple structure, 2D radial pair distributions were able to characterize and quantify the importance of two unique interresidue hydration sites in which a water molecule forms a bridge between the glucopyranosyl and fructofuranosyl residues. The approach is proposed to be a valuable tool for comparing and understanding the hydration of flexible biomolecules such as carbohydrates.

18.
A mathematical formulation for the 3D vortex method has been developed for calculation using the special-purpose computer MDGRAPE-2, which was originally designed for molecular dynamics simulations. We assessed this hardware on a few representative problems and compared the results obtained with and without it. It is found that generating appropriate function tables, which are used when calling the libraries embedded in MDGRAPE-2, is of primary importance for retaining accuracy. The error arising from the approximation is evaluated by calculating a pair of impinging vortex rings. A speedup of about 50 times is achieved with MDGRAPE-2, while the error in statistical quantities such as kinetic energy and enstrophy remains negligible.

19.
In the pursuit of studying the parameterization problem of molecular models from a broad perspective, this paper focuses on an isolated aspect: it investigates which algorithms can best optimize parameters simultaneously against different types of target data (experimental or theoretical) over a range of temperatures with the lowest number of iteration steps. As an example, nitrogen is considered, for which the intermolecular interactions are well described by the quadrupolar two-center Lennard-Jones model, which has four state-independent parameters. The target data comprise experimental values for saturated liquid density, enthalpy of vaporization, and vapor pressure. For the purpose of testing algorithms, molecular simulations are entirely replaced by fit functions of vapor-liquid equilibrium (VLE) properties from the literature, so that the diverse numerical optimization algorithms investigated, all state-of-the-art gradient-based methods with very good convergence properties, can be assessed efficiently. Additionally, artificial noise was superimposed onto the VLE fit results to mimic the calculation of molecular simulation data when evaluating the numerical optimization algorithms. Large differences in the behavior of the individual optimization algorithms are found, and some are identified as capable of handling noisy function values.

20.
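Dissipative particle dynamics (DPD) simulation is an important computational method for studying hydrodynamic properties. We design and implement large-scale DPD simulation on the Intel MIC platform, combining the characteristics of DPD simulation itself with the features of the MIC platform. The key code for neighbor-list construction and short-range forces in DPD simulation is vectorized, and a load-balancing mechanism distributes the computational tasks between the CPU and the MIC coprocessor, with support for balancing the number of threads within an MPI process. Performance comparisons were carried out both on a prototype program and within a LAMMPS integration. The experimental results demonstrate the effectiveness of the optimizations and lay a foundation for further work on molecular dynamics for the MIC many-core platform.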
