期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

胡家赣刘兴平《数值计算与计算机应用》1991,(3)

§1.引言众所周知,很多实际问题最后常需解一个或一些大型稀疏系数矩阵的线性代数方程组,对此一般都采用迭代法求解。对迭代法来说,收敛速度问题是一个关键问题。以往考察某些迭代法的收敛速度,常以正方形上Laplace方程或Poisson方程边值问题的通常五点差分格式(中心差分格式)为例,求出迭代矩阵的谱半径来加以比较。如JacobiGauss-Seidel和用最佳松弛因子ω_b的SOR方法(下面分别记为 J.GS 和SOR(ω_b))及相似文献

2.

三维对流扩散方程的稀疏存储及预条件迭代

下载免费PDF全文

袁冬芳曹富军《计算机工程与应用》2018,54(4):56-59

基于四阶紧致格式对三维对流扩散方程进行离散,并给出所得到的离散线性方程组的块三角稀疏矩阵形式。以带双阈值的不完全因子化LU分解[(ILUT(τ,s))]作为预条件子,分别用FGMRES、BICGSTAB和TFQMR作为迭代加速器,对离散线性方程组进行求解验证了格式精度并比较了不同迭代法的CPU时间和迭代步。此外,通过比较传统迭代法和预条件迭代法的计算效率,表明预条件迭代法不仅能够保证格式的四阶精度,还能极大地提高收敛效率。相似文献

3.

对称逐步超松弛预处理共轭梯度法的改进迭代格式 总被引：13，自引：0，他引：13

林绍忠《数值计算与计算机应用》1997,(4)

51．引言线性方程组的求解方法可分为两大类：直接法和迭代法．对于大型问题，当系数矩阵为条件数较小的稀疏矩阵且右端项不多时，迭代法的求解效率高．尽管迭代法多种多样，但其迭代收敛速度毫不例外地取决于迭代矩阵的条件数，而预处理的唯一目的就是降低迭代矩阵的条件数，从而达到减少迭代次数和计算量的目的．共轭梯度法（CG法）具有许多内在的优点，如有限步收敛性质．在实际计算中，由于舍入误差的影响，特别是由于系数矩阵的条件数常常较大，CG法往往出现收敛慢的问题．预处理共轭梯度法（PCG法）就是在共轭梯度法中采用了预处理… 相似文献

4.

GPU稀疏矩阵向量乘的性能模型构造

尹孟嘉许先斌何水兵胡婧叶从欢张涛《计算机科学》2017,44(4):182-187, 206

稀疏矩阵向量乘(Sparse matrix-vector multiplication,SPMV)是广泛应用于大规模线性求解系统和求解矩阵特征值等问题的基本运算,但在迭代处理过程中它也常常成为处理的瓶颈,影响算法的整体性能。对于不同形态的矩阵,选择不同的存储格式 ,对应的算法往往会产生较大的性能影响。通过实验分析,找到各种矩阵形态在不同存储结构下体现的性能变化特征,构建一个有效的性能度量模型,为评估稀疏矩阵运算开销、合理选择存储格式做出有效的指导。在14组CSR,COO,HYB格式和8组ELL格式的测试用例下,性能预测模型和测量之间的差异低于9%。相似文献

5.

一类四阶微积分方程的四阶差分格式

蔡耀雄任全伟庄清渠《数值计算与计算机应用》2014,(1):59-68

针对由吊桥模型而建立的四阶微积分方程,提出了四阶差分格式进行求解．对线性项采用紧格式进行离散,积分项则采用复化辛普森求积公式处理,再结合Newton型迭代法对方程进行求解．给出了差分格式解的存在性和收敛性的证明．数值结果表明格式的精度为O（h4）．相似文献

6.

求解鞍点问题的一般加速超松弛方法 总被引：2，自引：0，他引：2

邵新慧沈海龙李长军张铁《数值计算与计算机应用》2006,27(4):241-248

针对大型稀疏鞍点问题给出了一种含有待定参数的新迭代解法,将其称之为一般加速松弛方法,简记为GAOR方法．当参数α=时,新迭代方法是变成由Golub等人给出的SOR-Like方法．该迭代法的构成是基于对系数矩阵进行的一种分裂．迭代法需要选择一个预处理矩阵和待定参数,通过适当选取预处理矩阵和待定参数,新迭代法是收敛的,并且以定理的形式给出了新迭代方法的迭代矩阵的特征值和参数之间的基本等式,从而也导出了迭代法收敛的充分和必要条件．理论结果表明新方法更具有广泛性,并且适当的选择参数可以使新方法较SOR-Like方法具有更快的收敛速度．在文中的最后给出了迭代法的数值试验结果．相似文献

7.

一种针对GPU上的油藏数值模拟的高效SpMV

《数值计算与计算机应用》2016,(4)

油藏数值模拟和很多其他科学计算问题一样需要求解大型稀疏线性代数方程组.在求解稀疏线性代数方程组的迭代法中,稀疏矩阵向量乘法(SpMV)是影响计算效率的核心函数之一.随着计算机硬件架构异构化,科学计算从单核、多核CPU计算架构逐渐发展到多核CPU+众核加速卡(GPU卡或MIC等)的计算架构.SpMV的实现效率与稀疏矩阵的存储格式及硬件架构关系密切.本文针对油藏模拟中常见的Jacobian矩阵的稀疏模式,利用GPU核心的合并访问和并发计算等特点,结合油藏模拟线性解法器的算法要求,设计了一种BHYB矩阵存储格式及其对应的线程组并行策略.数值实验测得基于该存储格式的SpMV相对串行BCSR格式的SpMV的加速比可达19倍,比cuSPARSE库中效率最高的HYB格式的SpMV快30%到80%.此外,本文所提出的BHYB存储格式对块状矩阵在GPU上的存储以及线程组并行策略对其它GPU并行程序中内核函数的设计和优化能起到一定的借鉴作用. 相似文献

8.

基于GaBP算法的快速潮流计算方法

扈诗扬汪芳宗《计算技术与自动化》2016,(4):76-80

研究在潮流迭代求解过程中雅可比矩阵方程组的迭代求解方法及其收敛性。首先利用PQ分解法进行潮流迭代求解,并针对求解过程中雅可比矩阵对称且对角占优的特性,对雅可比矩阵方程组采用高斯置信传播算法(GaBP)进行求解,再结合Steffensen加速迭代法以提高GaBP算法的收敛性。对IEEE118、IEEE300节点标准系统和两个波兰互联大规模电力系统进行仿真计算后结果表明：随着系统规模的增长,使用Steffensen加速迭代法进行加速的GaBP算法相对于基于不完全LU的预处理广义极小残余方法(GMRES)具有更好的收敛性,为大规模电力系统潮流计算的快速求解提供了一种新思路。相似文献

9.

求解鞍点问题的广义正定和反Hermitian分裂方法

董贝贝鲍亮《计算机工程与科学》2019,41(9):1567-1573

探讨了如何求解大型稀疏鞍点问题,给出了一种基于正定分裂的广义正定和反Hermitian分裂(GPSS)方法。该方法首先利用矩阵的正定分裂,构造出鞍点矩阵的2种分裂格式;然后利用这2种分裂格式构造出GPSS迭代;接着给出了迭代收敛的充要条件。最后进行了数值对比实验,实验结果表明,GPSS比正定和反Hermitian分裂(PSS)和Hermitian和反Hermitian分裂(HSS)方法更有效。相似文献

10.

DRM:基于迭代归并策略的GPU并行SpMV存储格式

王宇华何俊飞张宇琪徐悦竹崔环宇《计算机工程与科学》2024,(3):381-394

稀疏矩阵向量乘(SpMV)在线性系统的求解问题中具有重要意义,是科学计算和工程实践中的核心问题之一,其性能高度依赖于稀疏矩阵的非零分布。稀疏对角矩阵是一类特殊的稀疏矩阵,其非零元素按照对角线的形式密集排列。针对稀疏对角矩阵,在GPU平台上提出的多种存储格式虽然使SpMV性能有所提升,但仍存在零填充和负载不平衡的问题。针对上述问题,提出了一种DRM存储格式,利用基于固定阈值的矩阵划分策略和基于迭代归并的矩阵重构策略,实现了少量零填充和块间负载平衡。实验结果表明,在NVIDIA^? Tesla^? V100平台上,相比于DIA、HDC、HDIA和DIA-Adaptive格式,在时间性能方面,该存储格式分别取得了20.76,1.94,1.13和2.26倍加速;在浮点计算性能方面,分别提高了1.54,5.28,1.13和1.94倍。相似文献

11.

A general iterative sparse linear solver and its parallelization for interval Newton methods

Chenyi Hu Anna Frolov R. Baker Kearfott Qing Yang 《Reliable Computing》1995,1(3):251-263

Interval Newton/Generalized Bisection methods reliably find all numerical solutions within a given domain. Both computational complexity analysis and numerical experiments have shown that solving the corresponding interval linear system generated by interval Newton's methods can be computationally expensive (especially when the nonlinear system is large). In applications, many large-scale nonlinear systems of equations result in sparse interval jacobian matrices. In this paper, we first propose a general indexed storage scheme to store sparse interval matrices We then present an iterative interval linear solver that utilizes the proposed index storage scheme It is expected that the newly proposed general interval iterative sparse linear solver will improve the overall performance for interval Newton/Generalized bisection methods when the jacobian matrices are sparse. In section 1, we briefly review interval Newton's methods. In Section 2, we review some currently used storage schemes for sparse systems. In Section 3, we introduce a new index scheme to store general sparse matrices. In Section 4, we present both sequential and parallel algorithms to evaluate a general sparse Jacobian matrix. In Section 5, we present both sequential and parallel algorithms to solve the corresponding interval linear system by the all-row preconditioned scheme. Conclusions and future work are discussed in Section 6. 相似文献

12.

A new storage scheme for an efficient implementation of the sparse matrix-vector product

P. Fernandes P. Girdinio 《Parallel Computing》1989,12(3):327-333

The most critical point in the implementation on a vector computer of iterative methods to solve sparse linear algebraic systems is the sparse matrix-vector product. Therefor we have compared its performance under three different sparse matrix storage schemes found in literature. These schemes have been tested on a FPS M64/330 using matrices similar to those arising in finite element or finite difference discretizations.

The results are reported and discussed. On the basis of them a new storage scheme is proposed and tested in similar situations. The obtained performance is never worse and often much better than those of the other schemes we have analyzed. 相似文献

13.

Incomplete block-factorization preconditioners for solving three-dimensional elliptic difference equations on systolic processors ∗ ∗∗

《国际计算机数学杂志》2012,89(1-4):341-364

An architecture of a parallel computing system for matrix computations based on a systolic array of processors is considered and a version of incomplete block-factorizations of sparse matrices that arise in finite difference solution of three dimensional elliptic differential equations of second order is studied. The parallelization, both of factorization and solution processes, is investigated and the implementation of a stationary preconditioned iterative method on the considered computer architecture is described. Representative numerical tests for the efficiency of the consecutive version of the method for solving three-dimensional finite difference approximations to the Poisson equation are presented. 相似文献

14.

A fast quasi-multiple medium method for 3-D bem calculation of parasitic capacitance 总被引：2，自引：0，他引：2

Wenjian Yu Zeyi Wang 《Computers & Mathematics with Applications》2003,45(12):1883-1894

A quasi-multiple medium (QMM) method is proposed to accelerate the boundary element method (BEM) for the 3-D parasitic capacitance calculation. In the QMM method, a homogeneous dielectric is decomposed into a number of fictitious medium blocks, each with the same permittivity of original medium. By the localization character of BEM, the QMM method makes great sparsity to the coefficient matrix of the overall discretized BEM equations. Then, using storing technique of sparse matrix and iterative equation solvers, the sparsity is explored to greatly reduce CPU time and memory usage of BEM computation. The computational complexity of the QMM accelerated BEM for a single-medium model problem is analyzed, and it is concluded as O(N), if the number of iterations is bounded. Numerical results verify the theoretical analysis and show the accelerating efficiency of the QMM method for calculation of 3-D parasitic capacitance. 相似文献

15.

Parallel Solutions for Large-Scale General Sparse Nonlinear Systems of Equations

下载免费PDF全文

HU Chengyi 《计算机科学技术学报》1996,11(3):257-271

In solving application problems,many large-scale nonlinear systems of equaions result in sparse Jacobian matrices.Such nonlinear systems are called sparse nonlinear systems.The irregularity of the locations of nonzrero elements of a general sparse matrix makes it very difficult to generally map sparse matrix computations to multiprocessors for parallel processing in a well balanced manner.To overcome this difficulty,we define a new storage scheme for general sparse matrices in this paper,With the new storage scheme,we develop parallel algorithms to solve large-scale general sparse systems of equations by interval Newton/Generalized bisection methods which reliably find all numerical solutions within a given domain.I n Section 1,we provide an introduction to the addressed problem and the interval Newton‘s methods.In Section 2,some currently used storage schemes for sparse systems are reviewed.In Section 3,new index schemes to store general sparse matrices are reported.In Section 4,we present a parallel algorithm to evaluate a general sparse Jacobian matrix.In Section 5,we present a parallel algorithm to solve the corresponding interval linear system by the all-row preconditioned scheme.Conclusions and future work are discussed in Section 6. 相似文献

16.

大型稀疏线性方程组符号LU分解法

张永杰孙秦《计算机工程与应用》2007,43(28):29-30

基于有限元总刚矩阵的大规模稀疏性、对称性等特性,采用全稀疏存储结构以及最小填入元算法,使得计算机的存储容量达到最少。为了节省计算机的运算时间,对总刚矩阵进行符号LU分解方法,大大减少了数值求解过程中的数据查询。这种全稀疏存储结构和符号LU分解相结合的求解方法,使大规模稀疏线性化方程组的求解效率大大提高。数值算例证明该算法在时间和存贮上都较为占优,可靠高效,能够应用于有限元线性方程组的求解。相似文献

17.

FPGA architecture and implementation of sparse matrix-vector multiplication for the finite element method

Yousef Elkurdi Evgueni Souleimanov Warren J. Gross 《Computer Physics Communications》2008,178(8):558-570

The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), hence increasing interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGAs computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory. 相似文献

18.

An incomplete factorization iterative technique for elliptic equations

《国际计算机数学杂志》2012,89(1-4):245-259

This paper describes efficient iterative techniques for solving the large sparse symmetric linear systems that arise from application of finite difference approximations to self-adjoint elliptic equations. We use an incomplete factorization technique with the method of D'Yakonov type, generalized conjugate gradient and Chebyshev semi-iterative methods. We compare these methods with numerical examples. Bounds for the 4-norm of the error vector of the Chebyshev semi-iterative method in terms of the spectral radius of the iteration matrix are derived. 相似文献

19.

Solving sparse linear systems of equations using the modified digraph approach

《国际计算机数学杂志》2012,89(8):919-929

We propose two different algorithms which depend on the modified digraph approach for solving a sparse system of linear equations. The main feature of the algorithms is that the solution of a sparse system of linear equations can be expressed exactly if all the non-zero entries, including the right-hand side, are integers and if none of the products exceeds the size of the largest integer that can be represented in the arithmetic of the computer used. The implementation of the algorithms is tested on five problems. The results are compared with those obtained using an algorithm proposed earlier. It is shown that the efficiency with which a sparse system of linear equations can be analysed by a digital computer using the proposed modified digraph approach as a tool depends mainly on the efficiency with which semifactors and k-semifactors are generated. Finally, in our implementation of the proposed algorithms, the input sparse matrix is stored using a row-ordered list of a modified uncompressed storage scheme. 相似文献