首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到19条相似文献,搜索用时 140 毫秒
1.
分子动力学模拟通常用于晶体硅热力学性质的研究,因原子间采用复杂的多体作用势,分子模拟通常面临较高的计算负载,导致计算的时间和空间尺度受限。图形处理器(GPU)采用并行多线程技术,用于计算密集型处理任务,在分子动力学模拟领域中显示巨大的应用潜力。因此,充分利用GPU硬件架构特性提升固态共价晶体硅分子动力学模拟的时空尺度对晶体硅导热机制的研究具有重要意义。基于固态共价晶体硅分子动力学模拟算法,提出面向GPU计算平台的固定邻居算法设计与优化。利用数据结构、分支结构优化等方法解决分子动力学模拟的固定邻居算法全局访存和分支结构的耗时问题,降低数据访存消耗和分支冲突,通过改变线程并行调度方式,在GPU计算平台上实现高性能并行计算,有效解决计算负载问题。实验结果表明,LAMMPS双精度固态晶体硅分子动力学模拟与双精度固定邻居算法的加速比为11.62,HOOMD-blue双精度固态晶体硅分子动力学模拟与双精度固定邻居算法和单精度固定邻居算法的加速比分别为9.39和12.18。  相似文献   

2.
在改进的细胞链表算法中,细胞大小的减少会降低该算法的通信量和粒子之间距离计算的次数,同时会增加部居细胞的数量。多细胞分子动力学算法是分子动力学模拟中普遍使用的并行算法。将改进细胞链表算法的基本思想应用到多细胞分子动力学算法中,推导出了一个分子动力学模拟性能评价模型,并据此提出一个优化模型来加速分子动力学模拟。实验结果表明,根据该优化模型确定的细胞大小可以提高分子动力学模拟程序的性能。  相似文献   

3.
张帅  徐顺  刘倩  金钟 《计算机科学》2018,45(10):291-294, 299
分子动力学模拟存在空间和时间的复杂性,并行加速分子的模拟过程尤为重要。基于GPU硬件数据并行架构的特点,组合分子动力学模拟的原子划分和空间划分的并行策略,优化实现了短程作用力计算Cell Verlet算法,并对分子动力学核心基础算法的GPU实现做了优化和性能分析。Cell Verlet算法实现首先采用原子划分的方式,将每个粒子的模拟计算任务映射到每个GPU线程,并采用空间划分的方式将模拟区域进行元胞划分,建立元胞索引表,实现粒子在模拟空间的实时定位;而在计算粒子间的作用力时,引入希尔伯特空间填充曲线方法来保持数据的线性存储与数据的三维空间分布的局部相关性,以便通过缓存加速GPU的全局内存访问;也利用了访存地址对齐和块内共享等技术来优化设计GPU分子动力学模拟过程。实例测试与对比分析显示,当前的算法实现具有强可扩展性和加速比等优势。  相似文献   

4.
文章针对三维分子动力学并行数值模拟中出现的负载不平衡现象,在静态负载平衡基础上,提出了一种简单有效的动态负载平衡算法。通过对三维分子动力学的并行数值模拟试验,此算法可以使得负载基本达到动态平衡,并进一步提高了并行效率。  相似文献   

5.
分子动力学并行算法研究   总被引:6,自引:1,他引:5  
文章阐述了三种用于分子动力学计算的并行算法,并分别就算法的计算量、通信量和负载均衡进行了分析。  相似文献   

6.
为提高分子动力学模拟在多核共享内存式服务器上的运算速度,在现有的分子动力学并行算法基础上提出了Multi-Critical算法。该算法使用手动划分力矩阵的方法,使多个线程进入不同名的临界区,并使用分块叠加的方法优化了并行算法,提高了并行效率。实验结果表明,对比之前的Critical算法,该算法的加速比和并行效率均有较大幅度的提高。  相似文献   

7.
分子动力学作为一种重要的计算手段在许多领域有着广泛的应用,由于它的计算量比较庞大,因此并行计算方法被越来越多地引入到分子动力学的模拟中。本文在目前常见的SMP集群系统上,根据系统的结构特点,针对分子动力学的三种并行算法:区域分解法、原子分解法和力分解法,利用MPI Pthread的混合编程模型,采用节点间消息传递模式以及节点内部共享存储的编程模式,实现了近程作用分子动力学的两级并行计算。计算结果表明,不同的算法采用了两级并行的方式和原来只有消息传递的并行方式相比,具有不同的计算效率,但是从总体来说采用两级并行的计算方式可以利用更多的计算资源,从而有助于提高计算能力。  相似文献   

8.
基于计算机的分子动力学仿真具有理论分析方法和实验方法无法比拟的优点,但分子动力学仿真算法计算量非常大,特别是在对碳纳米管的大规模粒子数进行仿真处理时,普通的基于CPU的串行算法执行效率低且耗时多。为此,提出基于统一计算设备架构的碳纳米管分子动力学的图形处理单元( GPU)并行算法,设计并实现仿真算法中适合GPU并行运算的分裂算法,将具有竞争资源的运算以非竞争方式运行。实验结果表明,与CPU串行仿真算法相比,分裂算法的运算速度较快,且在只有16个GPU流处理器显卡上可获得十多倍的加速比。  相似文献   

9.
基于分子动力学模拟的改进混合蛙跳算法   总被引:1,自引:0,他引:1  
针对基本的混合蛙跳算法(Shuffled frog leaping algorithm,SFLA)后期搜索速度变慢,容易陷入局部最优解的缺点,借鉴分子动力学(Molecular dynamics,MD)模拟的思想,提出一种基于分子动力学模拟的改进的混合蛙跳算法。该算法将种群中的粒子等效成分子,并提出一种新的分子间作用力计算方法来代替两体间经典的Lennard-Jones作用力计算方法,利用Velocity-Verlet算法和高斯变异算子代替基本混合蛙跳算法的更新策略,有效地平衡了种群的多样性和搜索的高效性。高维多峰函数测试的结果表明,基于分子动力学模拟的改进混合蛙跳算法能提高算法后期跳出局部极值的能力,全局寻优能力明显优于基本的混合蛙跳算法。  相似文献   

10.
提出了用于分子动力学模拟后处理氢键网络图的感知算法。对从头算分子动力学(CPMD)模拟的轨迹文件进行统计分析,得到了水溶液中醋酸单分子水合体系的局域结构特征信息。由径向分布函数得到定义氢键的上限,运用图论方法分析了溶液结构中水合团簇大小和简单氢键环结构及头环分布,其中六元环的突出含量与从头算稳定性分析一致。  相似文献   

11.
为了提高半经典分子动力学模拟中矩阵乘法效率,通过一种稀疏矩阵分解方法化简矩阵乘法,基于OpenMP实现矩阵相乘的Winograd并行算法。该算法将Winograd算法中各部分依次采用OpenMP并行计算,降低了数据通信。在16核服务器上测试表明,该方法能够显著提高半经典分子动力学模拟中矩阵乘法效率,并行加速比能够达到9.47,并具有良好的可扩展性,为大分子体系的模拟提供了可能。  相似文献   

12.
Molecular dynamics simulations investigate local and global motion in molecules. Several parallel computing approaches have been taken to attack the most computationally expensive phase of molecular simulations, the evaluation of long range interactions. This paper reviews these approaches and develops a straightforward but effective algorithm using the machine-independent parallel programming language, Linda. The algorithm was run both on a shared memory parallel computer and on a network of high performance Unix workstations. Performance benchmarks were performed on both systems using two proteins. This algorithm offers a portable cost-effective alternative for molecular dynamics simulations. In view of the increasing numbers of networked workstations, this approach could help make molecular dynamics simulations more easily accessible to the research community.  相似文献   

13.
为了提高分子动力学模拟在对称多处理(SMP)集群上的计算速度,在分子动力学并行方法中引入MPI+TBB的混合并行编程模型。基于该模型,在分子动力学软件LAMMPS中设计并实现混合并行算法,在节点间采用MPI及空间分解技术实施进程级并行,节点内采用TBB及临界区技术实施线程级并行。在SMP集群中的测试表明,该方法在体系较大以及节点数较多时可以明显减少通信时间,使加速比在纯MPI模型上提高45%。结果表明,MPI+TBB混合并行编程模型可促进分子动力学并行模拟且效率明显提升。  相似文献   

14.
《Parallel Computing》1997,23(9):1249-1260
A parallel algorithm for direct simulation Monte Carlo calculation of diatomic molecular rarefied gas flows is presented. For reliable simulation of such flow, an efficient molecular collision model is required. Using the molecular dynamics method, the collision of N2 molecules is simulated. For this molecular dynamics simulation, the parameter decomposition method is applied for parallel computing. By using these results, the statistical collision model of diatomic molecule is constructed. For validation this model is applied to the direct simulation Monte Carlo method to simulate the energy distribution at equilibrium condition and the structure of normal shock wave. For this DSMC calculation, the domain decomposition is applied. It is shown that the collision process of diatomic molecules can be calculated precisely and the parallel algorithm can be efficiently implemented on the parallel computer.  相似文献   

15.
A scalable parallel algorithm has been designed to study long-time dynamics of many-atom systems based on the nudged elastic band method, which performs mutually constrained molecular dynamics simulations for a sequence of atomic configurations (or states) to obtain a minimum energy path between initial and final local minimum-energy states. A directionally heated nudged elastic band method is introduced to search for thermally activated events without the knowledge of final states, which is then applied to an ensemble of bands in a path ensemble method for long-time simulation in the framework of the transition state theory. The resulting molecular kinetics (MK) simulation method is parallelized with a space-time-ensemble parallel nudged elastic band (STEP-NEB) algorithm, which employs spatial decomposition within each state, while temporal parallelism across the states within each band and band-ensemble parallelism are implemented using a hierarchy of communicator constructs in the Message Passing Interface library. The STEP-NEB algorithm exhibits good scalability with respect to spatial, temporal and ensemble decompositions on massively parallel computers. The MK simulation method is used to study low strain-rate deformation of amorphous silica.  相似文献   

16.
针对经典分子动力学和PIC方法等粒子类模拟方法具有粒子动态移动、粒子计算局部性好等共性,首先,提出了粒子量数据片对象.该对象是单网格片上的一团粒子,其中网格片是包含多个网格单元的矩形区域.然后,设计了并行算法,包括对象之间的粒子迁移和数据交换以及动态负载平衡.最后,在JASMIN框架上具体实现,进而开发了并行经典分子动力学程序和并行PIC程序.在64个处理器上实测表明,并行PIC程序模拟包含3百万个网格、2千万个粒子的复杂物理模型时,获得了80%的并行效率.  相似文献   

17.

The iterative Multilevel Averaging Weight (MAW) algorithm presented in paper [1] is modified to solve the dynamic load imbalance problems arising from the two-dimensional short-range parallel molecular dynamics simulations in this paper. Firstly, five types of load balancing models are given which allows detailed studies of the algorithm. In particular, it shows that for strip decomposition, the number of iteration needs for the system to converge from an initially unbalanced state to a well balanced state is bounded by 2 log P , where P is the number of processors. This result can permit the algorithm to efficiently track fluctuations in the molecular density as the simulation progresses, and is much better than that of the Cellular Automaton Diffusion (CAD) scheme presented in paper [2] . Secondly, we apply MAW algorithm to solve the load imbalance problem in the parallel molecular dynamics simulation for higher speed wall collisions. At last, the numerical experimental results and parallel computing performance with MPI-1.2 under a PC-Cluster consists of 64 Pentium-III 500 MHz nodes connected by 100 Mbps Switches are given in this paper.  相似文献   

18.
在Hadoop开源云计算平台上运行分子模拟程序,具有节省软硬件投资、缩短模拟时间等研究意义。然而,该平台并不擅长科学计算类应用中所涉及的快速迭代和子任务间通信。为此,在原子分解法基础上提出了三种解决方案并利用"读写HDFS同步法"实现短程作用力有效的分子动力学模拟的并行算法。在一个Hadoop集群上测试和分析了程序的可扩展性、加速比和各部分耗时情况,结果表明在大规模体系模拟中有较好的效果,最高取得了28倍的加速比。实验证明,Hadoop并行技术在分子模拟中有着较高的经济价值和实用价值。  相似文献   

19.
A domain decomposition algorithm for molecular dynamics simulation of atomic and molecular systems with arbitrary shape and non-periodic boundary conditions is described. The molecular dynamics program uses cell multipole method for efficient calculation of long range electrostatic interactions and a multiple time step method to facilitate bigger time steps. The system is enclosed in a cube and the cube is divided into a hierarchy of cells. The deepest level cells are assigned to processors such that each processor has contiguous cells and static load balancing is achieved by redistributing the cells so that each processor has approximately same number of atoms. The resulting domains have irregular shape and may have more than 26 neighbors. Atoms constituting bond angles and torsion angles may straddle more than two processors. An efficient strategy is devised for initial assignment and subsequent reassignment of such multiple-atom potentials to processors. At each step, computation is overlapped with communication greatly reducing the effect of communication overhead on parallel performance. The algorithm is tested on a spherical cluster of water molecules, a hexasaccharide and an enzyme both solvated by a spherical cluster of water molecules. In each case a spherical boundary containing oxygen atoms with only repulsive interactions is used to prevent evaporation of water molecules. The algorithm shows excellent parallel efficiency even for small number of cells/atoms per processor.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号