Similar literature
1.
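An iterative parallel computation method based on interface correction for heat conduction equations   Total citations: 3, self-citations: 0, citations by others: 3
In many practical computations, the radiation heat conduction equation is usually solved with implicit schemes because of the stability requirement on the time step. Implicit schemes are difficult to implement directly on parallel machines, while explicit difference schemes are easy to implement on parallel machines but have a severe stability condition. When the problem is very large, for example when it requires a large parallel computer with hundreds, thousands, or even tens of thousands of processors, strong data dependence and global communication become the prominent bottlenecks limiting high-performance computing. Adapting existing implicit schemes and developing parallel methods suited to large parallel computers is therefore a challenging problem that urgently needs to be solved in large-scale scientific and engineering computing. This paper briefly introduces the construction and basic properties of iterative parallel schemes based on interface correction. The proposed schemes are constructed by applying a prediction-correction technique at the interior boundaries of the partitioned subdomains and combining it with iterative solution inside the subdomains. The stability, convergence, and degree of parallelism of these schemes are discussed.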

2.
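We propose two new explicit schemes with improved stability restrictions. Compared with the classical explicit scheme, the stability restriction is relaxed by a factor of 4 for two-dimensional parabolic problems and by a factor of 2 for one-dimensional problems, while the accuracy is the same as that of the classical fully implicit scheme. We then design a finite difference domain decomposition algorithm that uses this new explicit scheme with a large step at the interior boundary points and the fully implicit scheme at the interior points. Its stability restriction is relaxed by a factor of 2m² for one-dimensional parabolic problems and 4m² for two-dimensional problems. A large time step can therefore be used, which saves a great deal of computation when solving parabolic problems in parallel.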
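
The structure described in this abstract (an explicit update at interior-boundary points on a large spacing, then implicit solves inside the subdomains) can be sketched for the 1D heat equation u_t = u_xx with zero boundary values. This is a minimal sketch under stated assumptions, not the authors' scheme: the interface update is the classical forward-Euler formula evaluated on the coarse spacing H = m·h, used as a stand-in for the new explicit scheme of the abstract, and all names (`thomas`, `implicit_subdomain`, `dd_step`, `solve`) are hypothetical.

```python
import math

def thomas(a, b, c, d):
    """Solve a tridiagonal system (a: sub-, b: main, c: super-diagonal, d: rhs)."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def implicit_subdomain(u_old, left, right, r):
    """Backward Euler on one subdomain; left/right are Dirichlet values at the new time."""
    n = len(u_old)
    d = list(u_old)
    d[0] += r * left
    d[-1] += r * right
    return thomas([-r] * n, [1 + 2 * r] * n, [-r] * n, d)

def dd_step(u, k, m, tau, h):
    """One time step with a single interface at grid index k."""
    H = m * h  # coarse spacing used only at the interface (assumption of this sketch)
    r = tau / h ** 2
    # Step 1: explicit prediction of the interface value on spacing H.
    interface = u[k] + tau / H ** 2 * (u[k - m] - 2 * u[k] + u[k + m])
    # Step 2: the two implicit subdomain solves are independent of each other.
    left = implicit_subdomain(u[1:k], u[0], interface, r)
    right = implicit_subdomain(u[k + 1:-1], interface, u[-1], r)
    return [u[0]] + left + [interface] + right + [u[-1]]

def solve(N=40, m=4, tau=1e-3, steps=100):
    """Return the max error against exp(-pi^2 t) sin(pi x) after `steps` steps."""
    h = 1.0 / N
    k = N // 2
    u = [math.sin(math.pi * i * h) for i in range(N + 1)]
    for _ in range(steps):
        u = dd_step(u, k, m, tau, h)
    t = steps * tau
    exact = [math.exp(-math.pi ** 2 * t) * math.sin(math.pi * i * h) for i in range(N + 1)]
    return max(abs(a - b) for a, b in zip(u, exact))
```

With the default values (N = 40, m = 4, τ = 0.001), the interface update satisfies τ ≤ H²/2 = 0.005, whereas a fully explicit scheme on the fine grid would need τ ≤ h²/2 ≈ 0.0003. The two calls to `implicit_subdomain` share no data within a step, so they could be assigned to separate processors.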

3.
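A parallelization method based on decomposition of the computational domain is proposed for building the stiffness matrix and right-hand-side vector in isogeometric analysis. The computational domain is decomposed into several disjoint subregions and each subregion is assigned to one processor. All processors carry out the computation on their subregions in parallel, and once every processor has finished, a fast merge algorithm assembles the linear system. Experiments show that the proposed method reaches a speedup of 6.46 on an 8-core machine and computes 6.8 million matrix entries in about 4 seconds. Using the Intel MKL sparse solver for the linear system, our isogeometric analysis solver handles 520,000 degrees of freedom in about 10 seconds, and the method is more than ten thousand times faster than ISOGAT.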

4.
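Parallelization of a three-dimensional laser ablation fluid interface instability code   Total citations: 1, self-citations: 0, citations by others: 1
This paper studies the parallel implementation of the three-dimensional laser ablation interface instability code (Lared-S) on shared-memory parallel machines and MPP parallel machines, using the MPI (Message Passing Interface) parallel programming environment. The numerical simulation of three-dimensional laser ablation uses a splitting method, and more than 90% of its computational load lies in solving the fluid equations and the heat conduction equation (the fluid equations are solved with a split explicit scheme, the heat conduction equation with a split implicit scheme). We give an alternating-plane data communication pattern based on the three-dimensional splitting scheme. Solving the split implicit scheme reduces to solving tridiagonal systems, which is parallelized with a block pipelined parallel algorithm. Numerical experiments show that the alternating-plane communication strategy and the block pipelined parallel algorithm are effective and scalable. On the shared-memory machine, a parallel efficiency above 93% is obtained with 64 processors; on the MPP machine, a parallel efficiency above 90% is obtained with 128 processors.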

5.
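This paper describes the key techniques for parallelizing UTCHEM, a numerical simulation program for chemical combination flooding, on the distributed-memory multicomputer parallel system SMP-CLUSTER. The parallel model follows the single-program multiple-data (SPMD) programming model and uses domain decomposition to split the whole solution domain into subdomains, so that multiple compute nodes solve a single simulation problem simultaneously. The compute nodes exchange the shared data of overlapping regions by message passing in order to coordinate the computation among nodes. At present only the solution of the pressure equations has been parallelized. Test results show good parallel efficiency.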

6.
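Research on numerical methods for the nonlinear Leland equation (an option pricing model with transaction costs) is of great practical importance. For the nonlinear Leland equation, this paper constructs an intrinsically parallel difference scheme, the alternating segment Crank-Nicolson (ASC-N) scheme, and gives the existence and uniqueness of the solution of the difference scheme, a stability analysis, and an error estimate. The theoretical analysis shows that the ASC-N scheme is an unconditionally stable parallel difference scheme. Numerical experiments show that the accuracy of the ASC-N scheme is comparable to that of the classical Crank-Nicolson scheme, while its computing time is nearly 50% shorter. The experiments confirm the theoretical analysis and show that the ASC-N scheme is effective for solving the nonlinear Leland equation.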

7.
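A parallel numerical method is proposed for groundwater flow in porous media, and on this basis a parallel computing module dedicated to solving large-scale three-dimensional groundwater flow equations is designed. The module solves the model domain in parallel by domain decomposition, uses distributed memory and compressed matrix techniques to store and operate on large sparse matrices, and integrates several parallel Krylov subspace methods and preconditioning techniques to solve large linear systems iteratively. Numerical simulation experiments on a Linux cluster show that the program has good speedup and scalability.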

8.
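1. Introduction. In recent years, as the scale and complexity of computing have grown, higher demands have been placed on computers and computational methods. This means that people need not only high-speed, large-memory parallel computers but also efficient parallel algorithms. To suit parallel computing, some numerical schemes have to be redesigned. Some numerical schemes, however, are intrinsically parallel and can be used directly for parallel computing. This is in fact the case for finite difference schemes for partial differential equations; they are now being actively studied, and some methods have been given [1,2]. In this article, taking the following problem as an example, we give a practical intrinsically parallel difference scheme. The scheme uses an explicit scheme on the subdomain boundaries and an implicit scheme in the interior; this scheme has many…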

9.
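Numerical methods for the Poisson equation are widely used in many physical and engineering problems. However, most discretizations of the three-dimensional Poisson equation have no obvious parallelism, and in practice global iteration is used, which limits computational efficiency and stability. Abandoning the global iteration of traditional numerical methods, and combining discrete sine transform (DST) theory with a 27-point fourth-order difference scheme, this work modifies and optimizes the three-dimensional Poisson solver for parallelism at the algorithm level: the whole problem is converted into multiple independent problems, which greatly improves stability and parallel performance. For a fixed discretization, the same set of parameters can be used to solve different Poisson equations, which greatly improves programming efficiency. The algorithm is implemented on a shared-memory parallel model. Experimental results show that, for the given example, the new algorithm achieves good speedup, the error of the computed results is about 10e-5, which is within the acceptable range, and the accuracy improves somewhat as the dimension increases.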
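
How a sine transform turns a Poisson solve into independent problems can be illustrated in one dimension. The sketch below is a simplified stand-in and not the method of the abstract: it uses the standard 3-point second-order scheme for -u'' = f on (0, 1) with zero boundary values instead of the 27-point fourth-order 3D scheme, and an O(n²) transform for readability. The names `dst` and `poisson_dst` are hypothetical.

```python
import math

def dst(v):
    """Discrete sine transform of type I, written as a plain O(n^2) sum."""
    n = len(v)
    return [
        sum(v[j] * math.sin(math.pi * (j + 1) * (k + 1) / (n + 1)) for j in range(n))
        for k in range(n)
    ]

def poisson_dst(f, h):
    """Solve the 3-point discretization of -u'' = f with u(0) = u(1) = 0.

    f holds the right-hand side at the n interior grid points, h = 1 / (n + 1).
    """
    n = len(f)
    f_hat = dst(f)
    # In the sine basis the difference operator is diagonal, so every mode k
    # is an independent scalar division; this loop could be split across workers.
    u_hat = [
        f_hat[k] / ((4.0 / h ** 2) * math.sin(math.pi * (k + 1) / (2 * (n + 1))) ** 2)
        for k in range(n)
    ]
    # The type-I transform is its own inverse up to the factor 2 / (n + 1).
    scale = 2.0 / (n + 1)
    return [scale * x for x in dst(u_hat)]
```

For example, with n = 31 interior points and f(x) = π² sin(πx), the returned values approximate sin(πx) to second order in h. In higher dimensions the same diagonalization is applied along each transformed direction, which is what makes the decoupled problems available for parallel execution.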

10.
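Applying the two-dimensional parallel constrained Delaunay mesh generation algorithm directly in three dimensions produces excessively short edges on the artificial boundaries. To address this problem, a three-dimensional parallel constrained Delaunay mesh generation algorithm based on a master-slave pattern is proposed and implemented. First, the solution domain is decomposed, and mesh consistency between subdomains is ensured by exchanging data on the artificial boundary faces. Second, a master and a slave subdomain are chosen for each artificial boundary face; the master subdomain generates and sends the boundary face mesh, and the slave subdomain receives it. Finally, a greedy algorithm balances the communication load of the subdomains, which improves the efficiency of the algorithm. Experimental results show that the algorithm can generate boundary-conforming tetrahedral meshes in parallel at large scale, has good parallel efficiency, and guarantees the quality of the final mesh.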
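
The abstract does not state the greedy rule it uses, so the following is only one plausible reading, given as a hedged sketch: each artificial boundary face is shared by two subdomains, and the master role goes to whichever of the two currently carries less load, processing the most expensive faces first. The function name `assign_masters`, the `(sub_a, sub_b, cost)` face tuples, and the cost model are all assumptions made for illustration.

```python
def assign_masters(faces):
    """Greedily pick a master subdomain for every shared boundary face.

    faces: list of (sub_a, sub_b, cost) tuples, where cost is a hypothetical
    estimate of the work of meshing and sending that face.
    Returns (masters, load): the master chosen per face index and the
    accumulated cost per subdomain.
    """
    load = {}
    for a, b, _ in faces:
        load.setdefault(a, 0)
        load.setdefault(b, 0)
    masters = {}
    # Handle the most expensive faces first so that cheaper ones can even out the load.
    for idx in sorted(range(len(faces)), key=lambda i: -faces[i][2]):
        a, b, cost = faces[idx]
        master = a if load[a] <= load[b] else b
        masters[idx] = master
        load[master] += cost
    return masters, load
```

For three subdomains that pairwise share one face of equal cost, `assign_masters([(0, 1, 5), (1, 2, 5), (0, 2, 5)])` gives each subdomain exactly one face to generate, so the load is evenly spread.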

11.

In this paper, a parallel difference scheme based on the Dufort-Frankel scheme and the classic implicit scheme for linear heat conduction equations is studied. In this procedure, the values at subdomain interfaces are calculated by using the Dufort-Frankel scheme, and these values then serve as Dirichlet boundary data for the implicit scheme in the subdomains. The weak necessary condition of the unconditional stability of the parallel difference scheme is proved. Numerical experiments indicate that the parallel difference scheme has good parallelism and better accuracy than the fully implicit scheme.

12.
International Journal of Computer Mathematics, 2012, 89(10): 1295-1306
A finite difference domain decomposition algorithm (DDA) for solving the heat equation in parallel is presented. In this procedure, interface values between subdomains are calculated by the group explicit formula, whereas interior values of subdomains are determined by the classical implicit scheme. The stability and convergence of this DDA are proved. The stability bound of the procedure is derived to be eight times that of the classical explicit scheme. Though the truncation error at the interface is O(τ + h), the L2-error is proved to be O(τ + h²). Numerical examples confirm the second-order convergence and indicate that the stability condition is sharp. A comparison of the numerical errors of this procedure with other known methods is also included.

13.
A program for the efficient parallel generation of tetrahedral meshes in a wide class of three-dimensional domains having a generalized cylindrical shape is presented. The applied mesh generation strategy is based on the decomposition of some 2D reference domain into simply connected subdomains. By means of the reference triangulations of these subdomains, the tetrahedral layers are built up in parallel. Adaptive grid control as well as nodal renumbering algorithms are involved. Several examples are included in the paper to demonstrate both the capabilities of the program and the proper handling of the implemented method of parallelization.

14.
In the parallel implementation of solution methods for parabolic problems, one has to find a proper balance between the parallel efficiency of a fully explicit scheme and the need for stability and accuracy, which requires some degree of implicitness. As a compromise, a domain splitting scheme is proposed which is locally implicit on slightly overlapping subdomains but propagates the corresponding boundary data by a simple explicit process. The analysis of this algorithm shows that it has satisfactory stability and approximation properties and can be effectively parallelized. These theoretical results are confirmed by numerical tests on a transputer system.

15.
In most recent substructuring methods, a fundamental role is played by the coarse space. For some of these methods (e.g. BDDC and FETI-DP), its definition relies on a 'minimal' set of coarse nodes (sometimes called corners) which assures invertibility of local subdomain problems and also of the global coarse problem. This basic set is typically enhanced by enforcing continuity of functions at some generalized degrees of freedom, such as average values on edges or faces of subdomains. We revisit existing algorithms for selection of corners. The main contribution of this paper consists of proposing a new heuristic algorithm for this purpose. Considering faces as the basic building blocks of the interface, inherent parallelism, and better robustness with respect to disconnected subdomains are among the features of the new technique. The advantages of the presented algorithm in comparison to some earlier approaches are demonstrated on three engineering problems of structural analysis solved by the BDDC method.

16.
The numerical investigation of the interaction of large, solid particles with fluids is an important area of research for many manufacturing processes. Such studies frequently lead to models that are very large and require the use of parallel solution techniques. This paper presents the results of a parallel implementation of a serial code for the direct numerical simulation of solid-liquid flows. The base code is a serial, arbitrary Lagrangian-Eulerian (ALE) formulation of the equations of motion, which views the particles as solid bodies embedded in the flow domain. This particular model poses some interesting difficulties for domain decomposition type approaches to parallel solution. In particular, it is not fully understood how the partitioning of the particles among the subdomains influences the performance of parallel solvers. We present several strategies for the partitioning of the solid particles, focusing on the effectiveness of these techniques in terms of parallel speedup and efficiency.

17.
The need for robust solutions for sets of nonlinear multivariate constraints or equations needs no motivation. Subdivision-based multivariate constraint solvers typically employ the convex hull and subdivision/domain clipping properties of the Bézier/B-spline representation to detect all regions that may contain a feasible solution. Once such a region has been identified, a numerical improvement method is usually applied, which quickly converges to the root. Termination criteria for this subdivision/domain clipping approach are necessary so that, for example, no two roots reside in the same sub-domain (root isolation). This work presents two such termination criteria. The first theoretical criterion identifies subdomains with at most a single solution. This criterion is based on the analysis of the normal cones of the multivariates and has been known for some time. Yet, a computationally tractable algorithm to examine this criterion has never been proposed. In this paper, we present a dual representation of the normal cones as parallel hyperplanes over the unit hypersphere, which enables us to construct an algorithm for identifying subdomains with at most a single solution. Further, we also offer a second termination criterion, based on the representation of bounding parallel hyperplane pairs, to identify and reject subdomains that contain no solution. We implemented both algorithms in the multivariate solver of the IRIT solid modelling system and present examples using our implementation.

18.
A solution-domain-decomposition (SDD) method is formulated for solving heat transfer problems and generalized for solving multi-domain problems. A generalized algorithm is suggested for parallel and distributed computation. A Chebyshev expansion of the dependent variables is used for pseudospectral approximation of the governing equation in this study. The linear superposition principle is adapted to incorporate the interactions between the subdomains. By effective subdivision of the computational domain, significant gains in computational efficiency and memory savings are accomplished without losing the spectral accuracy of the solution. Owing to the independent characteristics of the subdomains, the scheme is well suited for multi-processor machines. A convergence study reveals that spectral accuracy is still conserved for the multi-domain calculation. The calculation domain is divided into up to 8 subdomains and the calculation is distributed across independent CPUs. A significant speed-up ratio is obtained by distributing the subtasks through the network.

19.
The standard BDDC (balancing domain decomposition by constraints) preconditioner is shown to be equivalent to a preconditioner built from a partially subassembled finite element model. This results in a system of linear algebraic equations which is much easier to solve in parallel than the fully assembled model; the cost is then often dominated by that of the problems on the subdomains. An important role is also played, both in theory and practice, by an averaging operator, and in addition exact Dirichlet solvers are used on the subdomains in order to eliminate the residual in the interior of the subdomains. The use of inexact solvers for these problems and even the replacement of the Dirichlet solvers by a trivial extension are considered. It is established that one of the resulting algorithms has the same eigenvalues as the standard BDDC algorithm, and the connection of another with the FETI-DP algorithm with a lumped preconditioner is also considered. Multigrid methods are used in the experimental work and, under certain assumptions, it is established that the iteration count essentially remains the same as when exact solvers are used, while considerable gains in the speed of the algorithm can be realized, since the cost of the exact solvers grows superlinearly with the size of the subdomain problems while the multigrid methods are linear.

20.
To solve boundary value problems with moving fronts or sharp variations, moving mesh methods can be used to achieve reasonable solution resolution with a fixed, moderate number of mesh points. Such meshes are obtained by solving a nonlinear elliptic differential equation in the steady case, and a nonlinear parabolic equation in the time-dependent case. To reduce the potential overhead of adaptive partial differential equation (PDE)-based mesh generation, we consider solving the mesh PDE by various alternating Schwarz domain decomposition methods. Convergence results are established for alternating iterations with classical and optimal transmission conditions on an arbitrary number of subdomains. An analysis of a colouring algorithm is given which allows the subdomains to be grouped for parallel computation. A first result is provided for the generation of time-dependent meshes by an alternating Schwarz algorithm on an arbitrary number of subdomains. The paper concludes with numerical experiments illustrating the relative contraction rates of the iterations discussed.
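
The contraction behaviour of a classical alternating Schwarz iteration can be seen on a much simpler problem than the nonlinear mesh PDE of the abstract. The sketch below is an illustration under stated assumptions: it treats the linear model problem u'' = 0 on (0, 1) with u(0) = 0 and u(1) = 1 (exact solution u = x), two overlapping subdomains (0, β) and (α, 1) with α < β, and Dirichlet transmission conditions. Because each subdomain solution is a straight line, only the interface values need to be tracked. The function name and parameters are hypothetical.

```python
def alternating_schwarz(alpha, beta, iterations, guess=0.0):
    """Classical alternating Schwarz for u'' = 0, u(0) = 0, u(1) = 1.

    Subdomain 1 is (0, beta), subdomain 2 is (alpha, 1), with alpha < beta.
    Returns the error |u(beta) - beta| after each full sweep.
    """
    g_beta = guess  # current Dirichlet value imposed at x = beta
    errors = []
    for _ in range(iterations):
        # Subdomain 1: line through (0, 0) and (beta, g_beta), sampled at x = alpha.
        g_alpha = g_beta * alpha / beta
        # Subdomain 2: line through (alpha, g_alpha) and (1, 1), sampled at x = beta.
        g_beta = g_alpha + (1.0 - g_alpha) * (beta - alpha) / (1.0 - alpha)
        errors.append(abs(g_beta - beta))
    return errors
```

For this model problem the error shrinks by the factor α(1 − β) / (β(1 − α)) per sweep, so a wider overlap gives faster convergence; with α = 0.4 and β = 0.6 the factor is 4/9.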
