Similar literature
20 similar articles found.
1.
In this paper, based on the preconditioners presented by Rees and Greif [T. Rees, C. Greif, A preconditioner for linear systems arising from interior point optimization methods, SIAM J. Sci. Comput. 29 (2007) 1992-2007], we present a new block triangular preconditioner for the linear systems arising from finite element discretization of the mixed formulation of the time-harmonic Maxwell equations (k=0) in electromagnetic problems. The approach carries over because the linear systems in both settings have the same matrix block structure. Following the spectral analysis of Rees and Greif, we analyze the spectral distribution of the new preconditioners. In both theory and application, the presented preconditioners are as efficient to apply as those of Rees and Greif. Numerical experiments are also reported to illustrate their efficiency.

2.
In this paper, we study the effect of the choice of mesh quality metric, preconditioner, and sparse linear solver on the numerical solution of elliptic partial differential equations (PDEs). We smooth meshes on several geometric domains using various quality metrics and solve the associated elliptic PDEs using the finite element method. The resulting linear systems are solved using various combinations of preconditioners and sparse linear solvers. We use the inverse mean ratio and radius ratio metrics in addition to conditioning-based scale-invariant and interpolation-based size-and-shape metrics. We employ the Jacobi, SSOR, incomplete LU, and algebraic multigrid preconditioners and the conjugate gradient, minimum residual, generalized minimum residual, and bi-conjugate gradient stabilized solvers. We focus on determining the most efficient quality metric, preconditioner, and linear solver combination for the numerical solution of various elliptic PDEs with isotropic coefficients. We also investigate the effect of vertex perturbation and the effect of increasing the problem size on the number of iterations required to converge and on the solver time. In this paper, we consider Poisson’s equation, general second-order elliptic PDEs, and linear elasticity problems.
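
As an illustration of one solver and preconditioner pairing named in this abstract, the following is a minimal sketch of the conjugate gradient method with Jacobi (diagonal) preconditioning. It is not taken from the paper: the function name `jacobi_pcg`, the dense list-of-rows matrix, and the stopping rule are assumptions chosen to keep the example self-contained.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Conjugate gradient with Jacobi preconditioning (illustrative sketch).

    A: dense symmetric positive-definite matrix as a list of rows (assumption).
    Returns (x, iterations); stops when ||r|| <= tol * ||b||.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # M^{-1} with M = diag(A)
    x = [0.0] * n
    r = list(b)  # residual b - A x for the zero initial guess
    b_norm = dot(b, b) ** 0.5 or 1.0
    if dot(r, r) ** 0.5 <= tol * b_norm:
        return x, 0
    z = [d * ri for d, ri in zip(inv_diag, r)]  # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for k in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small SPD system whose exact solution is [1/11, 7/11].
solution, iterations = jacobi_pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The other preconditioners listed in the abstract (SSOR, incomplete LU, algebraic multigrid) would replace only the two lines that compute `z`.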

3.
The efficient solution of block tridiagonal linear systems arising from the discretization of convection–diffusion problems is considered in this paper. Starting with the classical nested factorization, we propose a relaxed nested factorization preconditioner. Then, several combination preconditioners are developed based on relaxed nested factorization and a tangential filtering preconditioner. The influence of the relaxation parameter is studied numerically; the results indicate that the optimal relaxation parameter should be close to but less than 1. The iteration counts are extremely sensitive to this parameter, a phenomenon that resembles the behaviour of the relaxed ILU preconditioner. For a symmetric positive-definite coefficient matrix, we also show that the proposed combination preconditioner is convergent. Finally, numerous test cases are carried out with both additive and multiplicative combinations to verify the robustness of the proposed preconditioners.

4.
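Storage of large-scale finite element stiffness matrices and a parallel solution algorithm   Total citations: 1, self-citations: 0, citations by others: 1
This paper proposes a method for assembling finite element stiffness matrices directly into a global stiffness matrix stored in compressed format, and designs a preconditioned restarted GMRES(m) parallel solver for the resulting linear system. The assembly method uses an "associated node" data structure, which records the connectivity information of the nodes in the mesh and serves as an intermediary in the assembly process. The method saves a large amount of storage and is simple and efficient. The solver uses Jacobi and sparse approximate inverse (SPAI) preconditioners. Numerical experiments on two- and three-dimensional elasticity problems show that in the two-dimensional case the SPAI preconditioner gives good convergence acceleration and parallel efficiency, while in the three-dimensional case the Jacobi preconditioner is better at reducing the time to iterative convergence.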
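
The idea of assembling element matrices directly into a compressed format through a node-connectivity table can be sketched as follows. This is only one possible reading of the abstract, not the paper's implementation: CSR is assumed as the compressed format, one degree of freedom per node is assumed, and the names `assemble_csr` and `element_matrix` are hypothetical.

```python
def assemble_csr(n_nodes, elements, element_matrix):
    """Assemble element matrices straight into CSR arrays (illustrative sketch).

    elements: list of tuples of node indices, one tuple per element.
    element_matrix(e): dense local matrix of element e as a list of rows.
    Assumes one degree of freedom per node.
    """
    # Step 1: connectivity table. For each node, collect every node that
    # shares an element with it (including the node itself).
    adjacent = [set() for _ in range(n_nodes)]
    for nodes in elements:
        for i in nodes:
            adjacent[i].update(nodes)

    # Step 2: the table fixes the sparsity pattern, so the CSR index arrays
    # can be built before any numerical value is known.
    row_ptr = [0]
    col_idx = []
    for i in range(n_nodes):
        col_idx.extend(sorted(adjacent[i]))
        row_ptr.append(len(col_idx))
    values = [0.0] * len(col_idx)

    # Step 3: scatter each element matrix into the preallocated value array,
    # never forming a dense global matrix.
    for e, nodes in enumerate(elements):
        ke = element_matrix(e)
        for a, i in enumerate(nodes):
            start, end = row_ptr[i], row_ptr[i + 1]
            for b, j in enumerate(nodes):
                pos = start + col_idx[start:end].index(j)
                values[pos] += ke[a][b]
    return row_ptr, col_idx, values


# Two 1D bar elements on three nodes, each with local matrix [[1, -1], [-1, 1]].
row_ptr, col_idx, values = assemble_csr(
    3, [(0, 1), (1, 2)], lambda e: [[1.0, -1.0], [-1.0, 1.0]]
)
```

Because the sparsity pattern is known after step 2, storage is allocated once and is proportional to the number of nonzeros.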

5.
Block preconditioners with circulant blocks (BPCBs) have been used for solving linear systems with block Toeplitz structure since 1992 [R. Chan, X. Jin, A family of block preconditioners for block systems, SIAM J. Sci. Statist. Comput. (13) (1992) 1218–1235]. In this paper, we apply BPCBs to general linear systems (which usually have no block structure). The BPCBs are constructed by partitioning a general matrix into a block matrix with blocks of the same size and then applying T. Chan’s optimal circulant preconditioner [T. Chan, An optimal circulant preconditioner for Toeplitz systems, SIAM J. Sci. Statist. Comput. (9) (1988) 766–771] to each block. These BPCBs can be viewed as a generalization of T. Chan’s preconditioner. It is well known that the optimal circulant preconditioner works well for solving some structured systems, such as Toeplitz systems, by the preconditioned conjugate gradient (PCG) method, but it is usually not efficient for solving general linear systems. Unlike T. Chan’s preconditioner, the BPCBs used here are efficient for solving some general linear systems by the PCG method. Several basic properties of BPCBs are studied. The relation of the block partition to the cost per iteration and the convergence rate of the PCG method is discussed. Numerical tests are given to compare the cost of the PCG method with different BPCBs.
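
The construction described in this abstract can be sketched in a few lines. T. Chan's optimal circulant approximation of a square block minimizes the Frobenius-norm distance over all circulant matrices, which amounts to averaging the entries along each wrapped diagonal. The function names below are hypothetical, dense lists of rows are assumed, and the sketch only builds the preconditioner; applying it efficiently (for example with FFTs) is not shown.

```python
def optimal_circulant_first_column(A):
    """First column of T. Chan's optimal circulant approximation of A.

    Entry j is the mean of all A[p][q] with (p - q) mod n == j.
    """
    n = len(A)
    sums = [0.0] * n
    for p in range(n):
        for q in range(n):
            sums[(p - q) % n] += A[p][q]
    return [s / n for s in sums]


def circulant(c):
    """Dense circulant matrix whose first column is c."""
    n = len(c)
    return [[c[(p - q) % n] for q in range(n)] for p in range(n)]


def bpcb(A, block_size):
    """Replace every block_size x block_size block of A by its optimal circulant."""
    n = len(A)
    if n % block_size != 0:
        raise ValueError("block_size must divide the matrix dimension")
    P = [[0.0] * n for _ in range(n)]
    for bi in range(0, n, block_size):
        for bj in range(0, n, block_size):
            block = [row[bj:bj + block_size] for row in A[bi:bi + block_size]]
            C = circulant(optimal_circulant_first_column(block))
            for p in range(block_size):
                for q in range(block_size):
                    P[bi + p][bj + q] = C[p][q]
    return P
```

With `block_size` equal to the matrix dimension, the sketch reduces to T. Chan's preconditioner; with `block_size` equal to 1, it returns the matrix itself.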

6.
The Laplace–Beltrami system of nonlinear, elliptic, partial differential equations has utility in the generation of computational grids on complex and highly curved geometry. Discretization of this system using the finite-element method accommodates unstructured grids, but generates a large, sparse, ill-conditioned system of nonlinear discrete equations. The use of the Laplace–Beltrami approach, particularly in large-scale applications, has been limited by the scalability and efficiency of solvers. This paper addresses this limitation by developing two nonlinear solvers based on the Jacobian-Free Newton–Krylov (JFNK) methodology. A key feature of these methods is that the Jacobian is not formed explicitly for use by the underlying linear solver. Iterative linear solvers such as the Generalized Minimal RESidual (GMRES) method do not technically require the stand-alone Jacobian; instead its action on a vector is approximated through two nonlinear function evaluations. The preconditioning required by GMRES is also discussed. Two different preconditioners are developed, both of which employ existing Algebraic Multigrid (AMG) methods. Further, the most efficient preconditioner, overall, for the problems considered is based on a Picard linearization. Numerical examples demonstrate that these solvers are significantly faster than a standard Newton–Krylov approach; a speedup factor of approximately 26 was obtained for the Picard preconditioner on the largest grids studied here. In addition, these JFNK solvers exhibit good algorithmic scaling with increasing grid size.
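
The statement that the Jacobian's action on a vector can be approximated through two nonlinear function evaluations can be illustrated with a forward difference, J(u) v ≈ (F(u + εv) − F(u)) / ε. The sketch below is a generic illustration, not the paper's solver: the function name, the fixed step `eps`, and the small test residual `F` are assumptions (practical codes usually scale ε by the norms of u and v).

```python
def jacobian_vector_product(F, u, v, eps=1e-7):
    """Approximate J(u) @ v without forming J (illustrative sketch).

    Uses exactly two evaluations of the nonlinear residual F:
    one at u and one at the perturbed point u + eps * v.
    """
    Fu = F(u)
    F_perturbed = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(fp - f0) / eps for fp, f0 in zip(F_perturbed, Fu)]


# Hypothetical residual with Jacobian [[2*u0, 1], [1, 3*u1**2]].
def F(u):
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 3 - 5.0]


# At u = [1, 2] and v = [1, 1] the exact product is [3, 13];
# the finite-difference result agrees up to an error of order eps.
jv = jacobian_vector_product(F, [1.0, 2.0], [1.0, 1.0])
```

A Krylov solver such as GMRES needs only this matrix-vector product, which is why the Jacobian never has to be stored.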

7.
Iterative solvers and preconditioners are widely used for handling the linear systems of equations arising from stochastic finite element method (SFEM) formulations, e.g. the Galerkin-based polynomial chaos (G-P-C) expansion method. In particular, the preconditioned conjugate gradient (PCG) solver and the incomplete Cholesky (IC) preconditioner have been shown to be adequate choices in this context. In this study, approaches for the automated adjustment of the input parameters of these tools are introduced. The proposed algorithms aim to enable the use of the PCG solver and the IC preconditioner in a black-box fashion. As a result, the expertise required to use these tools is reduced to a certain extent. Furthermore, these algorithms can also be used when implementing SFEMs within general-purpose software, making the tools easier to use and hence more comfortable for the user.

8.
GPU-accelerated preconditioned iterative linear solvers   Total citations: 1, self-citations: 1, citations by others: 0
This work is an overview of our preliminary experience in developing a high-performance iterative linear solver accelerated by GPU coprocessors. Our goal is to illustrate the advantages and difficulties encountered when deploying GPU technology to perform sparse linear algebra computations. Techniques for speeding up sparse matrix-vector product (SpMV) kernels and finding suitable preconditioning methods are discussed. Our experiments with an NVIDIA TESLA M2070 show that for unstructured matrices SpMV kernels can be up to 8 times faster on the GPU than the Intel MKL on the host Intel Xeon X5675 processor. Overall, the GPU-accelerated incomplete Cholesky (IC) factorization preconditioned CG method outperforms its CPU counterpart by a smaller factor, up to 3, and the GPU-accelerated incomplete LU (ILU) factorization preconditioned GMRES method achieves a speed-up nearing 4. However, with preconditioning techniques better suited to GPUs, this performance can be further improved.

9.
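Local block factorization of block tridiagonal matrices and its application to preconditioning   Total citations: 3, self-citations: 1, citations by others: 3
This paper uses estimates of the factors in the factorization of block tridiagonal matrices to analyze their local dependence, and uses this to construct a class of incomplete-factorization preconditioners. An estimate of the condition number of the preconditioned five-point difference matrix is given and compared with the actual values, showing the accuracy of the estimate and the effectiveness of the preconditioner. For the implementation, six serial implementation schemes of the preconditioner are considered and an effective parallelization method is proposed; the parallel algorithm has low communication cost. Finally, extensive numerical experiments are carried out on a cluster of 4 microcomputers connected by high-speed Ethernet, and the method is compared with other effective preconditioning methods. The results show that the preconditioning method performs well and is particularly suitable for parallel computing.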

10.
In this paper, we study the effect of enhancing GPU-accelerated Krylov solvers with preconditioners. We consider the BiCGSTAB, CGS, QMR, and IDR(s) Krylov solvers. For a large set of test matrices, we assess the impact of Jacobi and incomplete factorization preconditioning on the solvers’ numerical stability and time-to-solution performance. We also analyze how the use of a preconditioner impacts the choice of the fastest solver.

11.
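The scalability of the multigrid solvers SMG and BoomerAMG from the high-performance preconditioner library HYPRE is tested and analyzed on several thousand processors of a Chinese-built large-scale parallel machine, yielding several conclusions that are instructive for research on linear solver algorithms and for the development of parallel implementation techniques. These conclusions offer some guidance for the application and development of linear solvers in the numerical simulation of real, complex physical systems.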

12.
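First, for the linear finite element equations of H^1-type and H(curl)-type elliptic problems with jump coefficients, PCG methods are designed based on an AMG preconditioner and on a nodal auxiliary space preconditioner (HX preconditioner), respectively. Numerical experiments show that the iteration counts of the algorithms are essentially independent of the size of the coefficient jump and of the mesh size of the discretization. Building on this, an Uzawa algorithm based on the HX preconditioner is designed and analyzed for the discrete system obtained by applying first-kind Nédélec linear edge elements to the saddle-point problem of the Maxwell equations. When the coefficients are smooth, the convergence rate of the algorithm is proved to be independent of the mesh size. Numerical experiments show that the new algorithm is also efficient and stable in the jump-coefficient case.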

13.
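Sparse approximate inverse preconditioners and their parallel computation   Total citations: 1, self-citations: 0, citations by others: 1
Using a norm-minimization technique, this paper proposes a method for constructing parallel approximate inverse preconditioners for sparse matrices. The constructed sparse approximate inverse has the same sparsity pattern as the transpose of the data matrix. In terms of computational cost and storage, its solution process is easy to parallelize, and parallel computation does not affect its convergence. Trial computations show that the method gives a clear speedup for many problems. A parallel algorithm for the method is given, and an adaptive distribution algorithm is proposed to address load balancing.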
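
For orientation, the standard Frobenius-norm formulation of a sparse approximate inverse is shown below. Whether this paper uses exactly this right-inverse form is an assumption; the symbols are introduced here for illustration: $m_k$ is the $k$-th column of $M$, $e_k$ the $k$-th unit vector, $\mathcal{S}$ the prescribed sparsity pattern (per the abstract, that of $A^T$), and $\mathcal{S}_k$ its restriction to column $k$.

```latex
\min_{M \in \mathcal{S}} \|AM - I\|_F^2
  \;=\; \sum_{k=1}^{n} \; \min_{m_k \in \mathcal{S}_k} \|A m_k - e_k\|_2^2
```

Each term on the right is a small, independent least-squares problem for one column of $M$, which is why constructions of this type parallelize naturally.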

14.
In this work, we analyze the scalability of inexact two-level balancing domain decomposition by constraints (BDDC) preconditioners for Krylov subspace iterative solvers, when using a highly scalable asynchronous parallel implementation where fine and coarse correction computations are overlapped in time. This way, the coarse-grid problem can be fully overlapped by fine-grid computations (which are embarrassingly parallel) in a wide range of cases. Further, we consider inexact solvers to reduce the computational cost/complexity and memory consumption of coarse and local problems and boost the scalability of the solver. From our numerical experiments, we conclude that the BDDC preconditioner is quite insensitive to inexact solvers. In particular, one cycle of algebraic multigrid (AMG) is enough to attain algorithmic scalability. Further, the clear reduction in computing time and memory requirements of inexact solvers compared to sparse direct ones makes it possible to scale far beyond state-of-the-art BDDC implementations. Excellent weak scalability results have been obtained with the proposed inexact/overlapped implementation of the two-level BDDC preconditioner, up to 93,312 cores and 20 billion unknowns on JUQUEEN. Further, we have also applied the proposed setting to unstructured meshes and partitions for the pressure Poisson solver in the backward-facing step benchmark domain.

15.
We conduct simulations of the 3D unsteady-state anisotropic diffusion process with DT-MRI data in the human brain by discretizing the governing diffusion equation on a Cartesian grid and adopting a high-performance differential-algebraic equation (DAE) solver, the parallel version of the implicit differential-algebraic (IDA) solver, to tackle the resulting large-scale system of DAEs. Parallel preconditioning techniques, including sparse approximate inverse and banded-block-diagonal preconditioners, are used with the GMRES method to accelerate the convergence of the iterative solution. We then investigate and compare the efficiency and effectiveness of the two parallel preconditioners. The experimental results of the diffusion simulations on a parallel supercomputer show that the sparse approximate inverse preconditioning strategy, which is robust and efficient with good scalability, gives a much better overall performance than the banded-block-diagonal preconditioner.

16.
We study a conservative 5-point cell-centered finite volume discretization of the high-contrast diffusion equation. We aim to construct preconditioners that are robust with respect to the magnitude of the coefficient contrast and the mesh size simultaneously. To that end, we prove and numerically demonstrate the robustness of the preconditioner proposed by Aksoylu et al. (Comput Vis Sci 11:319–331, 2008) by extending the devised singular perturbation analysis from linear finite element discretization to the above discretization. The singular perturbation analysis is more involved than in the finite element case because all the subblocks in the discretization matrix depend on the diffusion coefficient. However, as the diffusion coefficient approaches infinity, that dependence is eliminated. Because the submatrices have similar limiting behaviours, the same preconditioner can be used, which narrows the family of preconditioners needed for different discretizations. Therefore, we have accomplished a desirable preconditioner design goal. We compare our numerical results to standard cell-centered multigrid implementations and observe that the performance of our preconditioner is independent of the smoothers and prolongation operators used. As a side result, we also prove a fundamental qualitative property of the solution of the high-contrast diffusion equation: the solution over the highly diffusive island becomes constant asymptotically. Integrating this qualitative understanding of the underlying PDE into our preconditioner is the main reason behind its superior performance. Diagonal scaling is probably the most basic preconditioner for high-contrast coefficients. Extending the matrix-entry-based spectral analysis introduced by Graham and Hagger, we rigorously show that the number of small eigenvalues of the diagonally scaled matrix depends on the number of isolated islands comprising the highly diffusive region. This indicates that diagonal scaling creates a significant clustering of the spectrum, a favorable property for faster convergence of Krylov subspace solvers.

17.
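For Krylov subspace iterative methods, constructing efficient preconditioners is one of the core problems, and overlapping domain decomposition is a very effective parallelization technique. Using the discrete solution of model partial differential equations and the linear systems arising in mesoscale numerical simulation of concrete, several reordering algorithms for the quotient graph are compared, including natural, RCM, Sloan, GPS, spectral, and random orderings. For the reordering of vertices within subdomains, natural, RCM, spectral, and random orderings and a new ordering algorithm are compared. The results show that preconditioning effectiveness is insensitive to the quotient graph ordering. Local ordering has a marked effect on preconditioner quality: random local ordering generally performs poorly; bandwidth-reduction algorithms have little effect on additive Schwarz but a considerable effect on block Jacobi parallel preconditioning; and for factor-combination parallel preconditioners, natural ordering and the new ordering perform better.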

18.
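1. Introduction. Consider solving the linear system $Ax = b$, $x, b \in \mathbb{R}^n$, (1) where $A = (a_{ij})_{n \times n}$ is a large sparse nonsymmetric matrix. Equation (1) is usually solved by iterative methods, such as the Krylov subspace methods GMRES, BiCGSTAB, CGS, TFQMR, and CGS2. Applied directly, iterative methods sometimes converge very slowly or not at all, so preconditioning is needed to accelerate convergence. A left or right preconditioner $M$ is usually used to transform (1) into a form that is easier to solve: $AMy = b$, $x = My$, or $MAx = Mb$. (2) Equation (2) is then solved by an iterative method; $M$ should be chosen so that $AM$ (or $MA$) is approximately the identity matrix. There are many ways to construct preconditioners, such as incomplete factorization, SSOR, and polynomial methods; incomplete factorization and SSOR…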

19.
The sparse matrix vector product (SpMV) is a key operation in engineering and scientific computing and, hence, it has been subjected to intense research for a long time. The irregular computations involved in SpMV make its optimization challenging. Therefore, enormous effort has been devoted to devise data formats to store the sparse matrix with the ultimate aim of maximizing the performance. Graphics Processing Units (GPUs) have recently emerged as platforms that yield outstanding acceleration factors. SpMV implementations for NVIDIA GPUs have already appeared on the scene. This work proposes and evaluates a new implementation of SpMV for NVIDIA GPUs based on a new format, ELLPACK‐R, that allows storage of the sparse matrix in a regular manner. A comparative evaluation against a variety of storage formats previously proposed has been carried out based on a representative set of test matrices. The results show that, although the performance strongly depends on the specific pattern of the matrix, the implementation based on ELLPACK‐R achieves higher overall performance. Moreover, a comparison with standard state‐of‐the‐art superscalar processors reveals that significant speedup factors are achieved with GPUs. Copyright © 2010 John Wiley & Sons, Ltd.
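
To make the storage idea concrete, here is a CPU-side sketch of an ELLPACK-R style layout: the usual padded ELLPACK arrays of values and column indices, plus an extra array holding the true length of each row so that the product loop can skip the padding. This is a reading of the format under stated assumptions, not the paper's CUDA kernel; the function names and the row-major Python lists are illustrative choices.

```python
def to_ellpack_r(dense):
    """Convert a dense matrix (list of rows) to ELLPACK-R style arrays.

    Returns (vals, cols, rl): padded values, padded column indices,
    and the number of stored nonzeros in each row.
    """
    rows = [[(j, a) for j, a in enumerate(row) if a != 0] for row in dense]
    rl = [len(r) for r in rows]
    width = max(rl, default=0)  # every row is padded to the longest row
    cols = [[j for j, _ in r] + [0] * (width - len(r)) for r in rows]
    vals = [[a for _, a in r] + [0.0] * (width - len(r)) for r in rows]
    return vals, cols, rl


def spmv_ellpack_r(vals, cols, rl, x):
    """y = A @ x; row i reads only its first rl[i] stored entries."""
    return [
        sum(vals[i][k] * x[cols[i][k]] for k in range(rl[i]))
        for i in range(len(rl))
    ]


vals, cols, rl = to_ellpack_r([[1.0, 0.0, 2.0], [0.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
y = spmv_ellpack_r(vals, cols, rl, [1.0, 2.0, 3.0])
```

On a GPU, the outer loop over rows would map to one thread per row, and the padded arrays would typically be stored column by column so that neighbouring threads read neighbouring memory.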

20.
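Sparse matrix-vector multiplication (SpMV) is an important problem in solving sparse linear systems, but because the nonzero elements are sparse, its computational density is low, which leads to poor computational efficiency. To address the irregularity of sparse matrices, SpMV can be computed with a hybrid storage format, which improves the compression efficiency for sparse matrices and broadens the range of matrices it suits. HYB is a widely used hybrid compression format with fairly stable performance. As GPU parallel computing has come into general use and CPUs have become increasingly multicore, building heterogeneous parallel computing systems from GPUs and multicore CPUs has gained wide acceptance. Based on the storage characteristics of the ELL and COO parts of the HYB format, the two parts of the data are partitioned between the CPU and the GPU for cooperative parallel computation, which both makes full use of CPU and GPU computing resources and exploits their respective computing characteristics, thereby improving the utilization of computing resources. Based on an analysis of the characteristics of the CPU+GPU heterogeneous computing model, data partitioning and sharing for the hybrid format are optimized, which makes good use of the heterogeneous computing environment and improves computational performance.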
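
The split that this abstract relies on can be sketched as follows: the first `k` nonzeros of each row go into a fixed-width ELL part, and any overflow goes into a COO list of (row, column, value) triples. The code is a serial illustration only; the function names and the choice of `k` are assumptions, and the paper's actual CPU/GPU partitioning and tuning are not reproduced.

```python
def split_hyb(dense, k):
    """Split a dense matrix (list of rows) into ELL and COO parts.

    ELL part: at most k entries per row, padded with value 0.0 and column 0.
    COO part: the remaining entries as (row, col, value) triples.
    """
    ell_vals, ell_cols, coo = [], [], []
    for i, row in enumerate(dense):
        entries = [(j, a) for j, a in enumerate(row) if a != 0]
        head, tail = entries[:k], entries[k:]
        ell_cols.append([j for j, _ in head] + [0] * (k - len(head)))
        ell_vals.append([a for _, a in head] + [0.0] * (k - len(head)))
        coo.extend((i, j, a) for j, a in tail)
    return ell_vals, ell_cols, coo


def spmv_hyb(ell_vals, ell_cols, coo, x):
    """y = A @ x computed as the sum of two independent partial products."""
    # Regular part: every row has the same width (padding contributes zero).
    y = [
        sum(a * x[j] for a, j in zip(row_vals, row_cols))
        for row_vals, row_cols in zip(ell_vals, ell_cols)
    ]
    # Irregular part: scattered overflow entries.
    for i, j, a in coo:
        y[i] += a * x[j]
    return y


ell_vals, ell_cols, coo = split_hyb([[1.0, 2.0, 3.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]], 1)
y = spmv_hyb(ell_vals, ell_cols, coo, [1.0, 1.0, 1.0])
```

Because the ELL and COO partial products do not depend on each other, they can be computed on different devices and summed afterwards, which is the property a CPU+GPU scheme of this kind exploits.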
