Similar articles
 20 similar articles found (search time: 640 ms)
1.
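To address the parallel assembly of multi-block structured overset grids, a multi-block structured overset grid framework that supports refinement of the initial grid system is designed. On this basis, a parallel hole-cutting algorithm based on local hole maps and a parallel search algorithm for cell-centered grids that can search for points across blocks are proposed, making the approach suitable for the distributed computing environments of large-scale parallel numerical simulation. The algorithm has been integrated as a module into an in-house large-scale multi-block structured grid solver (CCFD-MGMB) and supports large-scale parallel unsteady simulation of multi-body separation. Parallel tests show that the algorithm has well-organized local data structures and strong data scalability. Numerical applications demonstrate its effectiveness and correctness; the efficiency of parallel unsteady computation on about a thousand cores (relative to 64 cores) reaches 58%.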

2.
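In parallel computing for reservoir numerical simulation, improving computing speed and resource utilization is an important research direction. This paper presents an overall optimization method for parallel simulation of multilayer reservoirs in a distributed parallel environment. The method solves in parallel with an efficient domain decomposition method and dynamically selects between two different computational granularities, which effectively overcomes the performance degradation caused by load imbalance. Computations on real models show that the strategy reduces the overall simulation time and achieves a relatively high speedup. The algorithm is applicable to a class of multilayer reservoir model problems.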

3.
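Solving the constrained parallel multi-machine scheduling problem with a parallel genetic algorithm   Cited by: 2 (self: 0, others: 2)
Wu Hao, Cheng Jinsong. Microcomputer Development (《微机发展》), 2001, 11(1): 19-22
The genetic algorithm is a numerical method for global optimization and is naturally parallel. This paper proposes a master-slave controlled network parallel genetic algorithm for the constrained parallel multi-machine scheduling problem and implements it in a PVM environment. The computational results show that the parallel genetic algorithm is effective and applicable to large-scale parallel multi-machine scheduling problems.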

4.
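A three-phase parallel programming method based on design patterns   Cited by: 7 (self: 1, others: 7)
Programmability is one of the difficulties of parallel computing, and solving non-numerical problems in parallel with traditional methods is even harder. By extending the concept of design patterns, algorithm patterns and structure patterns are defined. On this basis, a three-phase parallel programming method based on design patterns is proposed, with systematic support provided through an algorithm pattern library and a structure pattern library. The method can be used for general numerical problems and is much simpler than traditional methods when handling complex non-numerical problems. A simple numerical problem and a complex non-numerical problem serve as examples to illustrate parallel problem solving and parallel program design with this method.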

5.
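To address the load imbalance that arises in parallel numerical simulation of three-dimensional molecular dynamics, this paper proposes a simple and effective dynamic load balancing algorithm built on static load balancing. Parallel numerical simulation experiments on three-dimensional molecular dynamics show that the algorithm keeps the load essentially balanced dynamically and further improves parallel efficiency.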

6.
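A parallel algorithm for chip-level three-dimensional parasitic capacitance extraction   Cited by: 1 (self: 0, others: 1)
With the growing popularity of multi-core CPUs and distributed clusters, parallel computing is increasingly applied in science and engineering to solve complex numerical simulation problems. This paper proposes a parallel algorithm for chip-level three-dimensional parasitic capacitance extraction. Based on the three-dimensional hierarchical block boundary element method, it applies a bidirectional overlapping combination scheme to divide the chip into four classes of "windows" of different sizes. A variable-length dynamic hybrid queue drives a combined static/dynamic task scheduling method that assigns all windows to different processes, and techniques that improve parallel efficiency are applied to sparse matrix summation and to inter-process reduction sums, achieving good load balance and a high speedup. Experiments programmed with the Message Passing Interface on a distributed cluster verify the effectiveness of the algorithm.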

7.
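A multi-objective genetic algorithm based on elitist selection and individual migration   Cited by: 6 (self: 0, others: 6)
A genetic-algorithm-based method for solving multi-objective optimization problems is proposed. The multi-objective problem is decomposed into several single-objective optimization problems, and a genetic algorithm searches each single-objective population in parallel. In every generation of the evolution, elitist selection and individual migration strategies accelerate the parallel search over the objectives. A limited-precision method is proposed to control the number of Pareto-optimal solutions while preserving individual diversity, together with a termination condition for the multi-objective genetic algorithm. Numerical experiments show that the algorithm quickly finds a set of widely and evenly distributed Pareto-optimal solutions.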

8.
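The genetic algorithm is a numerical method for global optimization and is naturally parallel. This paper proposes a master-slave controlled network parallel genetic algorithm for the constrained parallel multi-machine scheduling problem and implements it in a PVM environment. The computational results show that the parallel genetic algorithm is effective and applicable to large-scale parallel multi-machine scheduling problems.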

9.
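Power system dynamic simulation   Cited by: 1 (self: 0, others: 1)
This paper reviews the structure of the mathematical models and the numerical methods used in power system dynamic simulation, along with current problems such as stiffness, numerical stability, error, and nonlinear characteristics, and their solutions. It focuses on the various numerical schemes of the partitioned solution method and the simultaneous solution method for power system dynamic simulation, and on the measures taken to improve numerical stability and to handle stiffness. Finally, it introduces recent developments and trends in power system simulation.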

10.
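Given the drawbacks of mesh generation and the long computation times in numerical simulation, research on the meshfree parallel SPH method is necessary, and its most time-consuming part is the particle search algorithm. Building on a thorough study of the bucket search algorithm, a cell search algorithm is proposed and combined with dynamic load balancing, which significantly improves parallel performance.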

11.

The iterative Multilevel Averaging Weight (MAW) algorithm presented in paper [1] is modified in this paper to solve the dynamic load imbalance problems arising in two-dimensional short-range parallel molecular dynamics simulations. First, five types of load balancing models are given, which allow detailed study of the algorithm. In particular, it is shown that for strip decomposition, the number of iterations needed for the system to converge from an initially unbalanced state to a well-balanced state is bounded by 2 log P, where P is the number of processors. This result permits the algorithm to efficiently track fluctuations in the molecular density as the simulation progresses, and is much better than that of the Cellular Automaton Diffusion (CAD) scheme presented in paper [2]. Second, we apply the MAW algorithm to solve the load imbalance problem in the parallel molecular dynamics simulation of higher-speed wall collisions. Finally, numerical experimental results and parallel computing performance are given for MPI-1.2 on a PC cluster consisting of 64 Pentium-III 500 MHz nodes connected by 100 Mbps switches.

12.
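A measurement-based dynamic load balancing method for high dimensions   Cited by: 3 (self: 0, others: 3)
Cao Xiaolin, Mo Zeyao. Chinese Journal of Computers (《计算机学报》), 2005, 28(9): 1440-1446
To address strongly irregular structured load problems in large-scale scientific computing, the authors develop a measurement-based dynamic load balancing method. First, the simulation domain, composed of regular structured grids, is divided into multiple blocks. Second, the high-dimensional coordinates of the blocks are converted into one-dimensional Hilbert space-filling curve (HSFC) indices. Third, based on measured information, the blocks ordered by their one-dimensional HSFC indices are partitioned with the multilevel averaging weight method. Finally, the blocks are redistributed according to the partition to balance the load. The method extends the multilevel averaging weight method, previously applicable only in one dimension, to two and three dimensions, and introduces more measured information and a block data structure. Compared with the ISP method, it improves load balancing efficiency by 10% on 64 CPUs. When simulating a strongly irregular structured load problem on 500 CPUs of an MPP, it achieves 88% load balancing efficiency and 84% parallel efficiency.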
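The pipeline in this abstract (blocks, Hilbert index, one-dimensional partition by measured cost) can be illustrated with a minimal two-dimensional sketch. It is not the authors' implementation: the function names are hypothetical, the Hilbert mapping is the textbook bit-manipulation form, and a simple prefix-sum cut stands in for the multilevel averaging weight method.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its Hilbert curve index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_by_weight(weights, num_parts):
    """Cut a 1D sequence of positive weights into contiguous parts of similar total weight.

    Assumption: a plain prefix-sum cut, used here in place of the
    multilevel averaging weight method named in the abstract.
    """
    total = float(sum(weights))
    owners, prefix = [], 0.0
    for w in weights:
        mid = prefix + w / 2.0  # midpoint of this block in cumulative weight
        owners.append(min(int(mid * num_parts / total), num_parts - 1))
        prefix += w
    return owners


def balance_blocks(measured_cost, n, num_parts):
    """measured_cost: dict mapping block coordinates (x, y) to a measured run time."""
    ordered = sorted(measured_cost, key=lambda b: hilbert_index(n, b[0], b[1]))
    owners = partition_by_weight([measured_cost[b] for b in ordered], num_parts)
    return dict(zip(ordered, owners))
```

Because blocks are cut along the curve, each processor receives a contiguous run of the curve, which tends to keep its blocks spatially close.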

13.
Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. We present a novel method called PLUM to dynamically balance the processor workloads with a global view. This paper describes the implementation and integration of all major components within our dynamic load balancing strategy for adaptive grid calculations. Mesh adaption, repartitioning, processor assignment, and remapping are critical components of the framework that must be accomplished rapidly and efficiently so as not to cause a significant overhead to the numerical simulation. A data redistribution model is also presented that predicts the remapping cost on the SP2. This model is required to determine whether the gain from a balanced workload distribution offsets the cost of data movement. Results presented in this paper demonstrate that PLUM is an effective dynamic load balancing strategy which remains viable on a large number of processors.

14.
We develop scalable parallel domain decomposition algorithms for nonlinear complementarity problems including, for example, obstacle problems and free boundary value problems. Semismooth Newton is a popular approach for such problems; however, the method is not suitable for large scale calculations because the number of Newton iterations is not scalable with respect to the grid size; i.e., when the grid is refined, the number of Newton iterations often increases drastically. In this paper, we introduce a family of Newton-Krylov-Schwarz methods based on a smoothed grid sequencing method, a semismooth inexact Newton method, and a two-grid restricted overlapping Schwarz preconditioner. We show numerically that such an approach is totally scalable in the sense that the number of Newton iterations and the number of linear iterations are both nearly independent of the grid size and the number of processors. In addition, the method is not sensitive to the sharp discontinuity often associated with obstacle problems. We present numerical results for several large scale calculations obtained on machines with hundreds of processors.

15.
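Data redistribution is an important means of achieving load balance in message-passing environments. This paper proposes a model problem for interleaved data distribution and a parallel computing model for that problem, analyzes the implementation of the model problem in a message-passing environment, discusses performance and conditions of applicability, presents the analysis results, and discusses the overlap of communication and computation in time. The interleaved data redistribution load balancing technique is applied to the parallel computation of systems of non-equilibrium stiff dynamics equations and achieves very good load balance.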

16.
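Numerical simulation of combustion usually represents the computational domain with unstructured meshes. In parallel simulation on unstructured meshes, mesh adaptation causes the computational load on different processes to change frequently and differ greatly, which leads to low parallel efficiency. An effective way to improve parallel efficiency is dynamic load balancing. This paper proposes a dynamic load balancing method driven by the chemical reaction state of combustion: it uses different strategies to predict the computational load on each process at different stages of the chemical reaction and, based on the predictions, evens out the computational tasks among processes to achieve load balance. Experimental analysis shows that the method effectively reduces load imbalance among processes and reduces the overall simulation run time by 10%.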

17.
A practical processor self-scheduling scheme, trapezoid self-scheduling, is proposed for arbitrary parallel nested loops in shared-memory multiprocessors. Generally, loops are the richest source of parallelism in parallel programs. To dynamically allocate loop iterations to processors, one may achieve load balancing among processors at the expense of run-time scheduling overhead. By linearly decreasing the chunk size at run time, the best tradeoff between the scheduling overhead and balanced workload can be obtained in the proposed trapezoid self-scheduling approach. Due to its simplicity and flexibility, this approach can be efficiently implemented in any parallel compiler. The small and predictable number of chores also allows efficient management of memory in a static fashion. The experiments conducted on a 96-node Butterfly GP-1000 clearly show the advantage of trapezoid self-scheduling over other well-known self-scheduling approaches.
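The idea of linearly decreasing chunk sizes can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and the defaults (first chunk of about N/(2P) iterations, last chunk of 1) are the conservative choice commonly attributed to this scheme, which the abstract itself does not state.

```python
import math


def tss_chunks(num_iters, num_procs, first=None, last=1):
    """Return chunk sizes that decrease linearly from `first` toward `last`.

    Assumed defaults: first = ceil(N / (2P)), last = 1.
    """
    if num_iters <= 0:
        return []
    if first is None:
        first = max(last, math.ceil(num_iters / (2 * num_procs)))
    # Number of chunks of a trapezoid with parallel sides `first` and `last`.
    n_chunks = math.ceil(2 * num_iters / (first + last))
    step = (first - last) // (n_chunks - 1) if n_chunks > 1 else 0

    chunks, remaining, size = [], num_iters, first
    while remaining > 0:
        chunk = min(max(size, last), remaining)  # clip the final chunk
        chunks.append(chunk)
        remaining -= chunk
        size -= step
    return chunks
```

An idle processor would take the next entry of this list from a shared counter, so early requests grab large chunks (low scheduling overhead) and late requests grab small ones (better balance at the end of the loop).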

18.
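Existing parallelism identification methods have shortcomings when applied to many-core processors: when the loop dimension chosen for parallelization has few iterations, severe load imbalance can result. To address this, a multi-dimensional parallelism identification method for many-core processors is proposed. When existing methods cannot achieve good load balance, it selects several dimensions of a nested loop for parallelization and merges the iteration spaces of these dimensions before dividing the work, reducing the impact of load imbalance on parallel efficiency. The method has been implemented in the automatic parallelization system developed by the research group, and in practice it improves the parallel execution efficiency of some applications on many-core processors.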
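Merging the iteration spaces of two loop dimensions before dividing the work can be illustrated with a small sketch. The function names are hypothetical and the even block split is an assumption; the abstract does not describe how the system divides the merged space.

```python
def split_range(total, num_workers):
    """Split range(total) into num_workers contiguous, nearly equal [start, end) pieces."""
    base, extra = divmod(total, num_workers)
    bounds, start = [], 0
    for w in range(num_workers):
        size = base + (1 if w < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def collapsed_assignment(n_outer, n_inner, num_workers):
    """Assign (i, j) iterations of a 2-level loop nest after merging both dimensions."""
    work = []
    for lo, hi in split_range(n_outer * n_inner, num_workers):
        # Recover the original loop indices from the flattened index k.
        work.append([divmod(k, n_inner) for k in range(lo, hi)])
    return work
```

With 3 outer and 100 inner iterations on 8 workers, parallelizing only the outer loop leaves 5 workers idle, whereas the merged space of 300 iterations gives every worker 37 or 38 iterations.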

19.
In this paper we consider the scalability of parallel space-filling curve generation as implemented through parallel sorting algorithms. Multiple sorting algorithms are studied and results show that space-filling curves can be generated quickly in parallel on thousands of processors. In addition, performance models are presented that are consistent with measured performance and offer insight into performance on still larger numbers of processors. At large numbers of processors, the scalability of adaptive mesh refined codes depends on the individual components of the adaptive solver. One such component is the dynamic load balancer. In adaptive mesh refined codes, the mesh is constantly changing resulting in load imbalance among the processors requiring a load-balancing phase. The load balancing may occur often, requiring the load balancer to perform quickly. One common method for dynamic load balancing is to use space-filling curves. Space-filling curves, in particular the Hilbert curve, generate good partitions quickly in serial. However, at tens and hundreds of thousands of processors serial generation of space-filling curves will hinder scalability. In order to avoid this issue we have developed a method that generates space-filling curves quickly in parallel by reducing the generation to integer sorting. Copyright © 2007 John Wiley & Sons, Ltd.
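Reducing curve generation to integer sorting can be illustrated with a short sketch. Two substitutions are made for brevity and should be read as assumptions: a Morton (Z-order) key replaces the Hilbert curve, and Python's serial `sorted` stands in for the parallel sorting algorithms studied in the paper. The function names are hypothetical.

```python
def morton_key(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one integer key (Z-order)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return key


def order_cells(cells, bits=16):
    """Order (x, y) cells along the curve by sorting their integer keys."""
    return sorted(cells, key=lambda c: morton_key(c[0], c[1], bits))
```

Once every cell carries an integer key, ordering the mesh along the curve is an ordinary integer sort, so a distributed sort can replace the serial one without changing the key computation.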

20.
A repartitioning hypergraph model for dynamic load balancing   Cited by: 1 (self: 0, others: 1)
In parallel adaptive applications, the computational structure of the applications changes over time, leading to load imbalances even though the initial load distributions were balanced. To restore balance and to keep communication volume low in further iterations of the applications, dynamic load balancing (repartitioning) of the changed computational structure is required. Repartitioning differs from static load balancing (partitioning) due to the additional requirement of minimizing migration cost to move data from an existing partition to a new partition. In this paper, we present a novel repartitioning hypergraph model for dynamic load balancing that accounts for both communication volume in the application and migration cost to move data, in order to minimize the overall cost. The use of a hypergraph-based model allows us to accurately model communication costs rather than approximate them with graph-based models. We show that the new model can be realized using hypergraph partitioning with fixed vertices and describe our parallel multilevel implementation within the Zoltan load balancing toolkit. To the best of our knowledge, this is the first implementation for dynamic load balancing based on hypergraph partitioning. To demonstrate the effectiveness of our approach, we conducted experiments on a Linux cluster with 1024 processors. The results show that, in terms of reducing total cost, our new model compares favorably to the graph-based dynamic load balancing approaches, and multilevel approaches improve the repartitioning quality significantly.
