首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
In this work, the solution of a large sparse linear system of equations with an arbitrary sparsity pattern is obtained by using LU-decomposition method as well as numerical structure approach. The LU-decomposition method is based on Doolittle's method while the numerical structure approach is based on Cramer's rule. The numerical structure approach produces direct solution without facing fill-in problems as encountered in LU-decomposition. In order to reduce the ‘fill-ins’ in the decomposition, the powers of a Boolean matrix, obtained from the coefficient matrix A are taken so that the ‘fill-ins’ in the structure of A can be known in advance. The position of fill-ins in A are thus determined in the best choice manner, that is, it is very effective and memory-wise cheap. We also outline a method by using numerical structure with reduced computation efforts.Finally, experiments are performed on eight examples to compare the efficiency of the proposed methods. The results obtained are reported in a table. It is found that the LU-decomposition method is much better than numerical structure. The usefulness of numerical structure approach is also discussed.  相似文献   

2.
3.
We consider the iterative solution of large sparse linear systems of equations arising from elliptic and parabolic partial differential equations in two or three space dimensions. Specifically, we focus our attention on nonsymmetric systems of equations whose eigenvalues lie on both sides of the imaginary axis, or whose symmetric part is not positive definite. This system of equation is solved using a block Kaczmarz projection method with conjugate gradient acceleration. The algorithm has been designed with special emphasis on its suitability for multiprocessors. In the first part of the paper, we study the numerical properties of the algorithm and compare its performance with other algorithms such as the conjugate gradient method on the normal equations, and conjugate gradient-like schemes such as ORTHOMIN(k), GCR(k) and GMRES(k). We also study the effect of using various preconditioners with these methods. In the second part of the paper, we describe the implementation of our algorithm on the CRAY X-MP/48 multiprocessor, and study its behavior as the number of processors is increased.  相似文献   

4.
5.
研究了基于GPU的稀疏线性方程组的预条件共轭梯度法加速求解问题,并基于统一计算设备架构(CUDA)平台编制了程序,在NVIDIAGT430 GPU平台上进行了程序性能测试和分析。稀疏矩阵采用压缩稀疏行(CSR)格式压缩存储,针对预条件共轭梯度法的算法特性,研究了基于GPU的稀疏矩阵与向量相乘的性能优化、数据从CPU端传到GPU端的加速传输措施。将编制的稀疏矩阵与向量相乘的kernel函数和CUSPARSE函数库中的cusparseDcsrmv函数性能进行了对比,最优得到了2.1倍的加速效果。对于整个预条件共轭梯度法,通过自编kernel函数来实现的算法较之采用CUBLAS库和CUSPARSE库实现的算法稍具优势,与CPU端的预条件共轭梯度法相比,最优可以得到7.4倍的加速效果。  相似文献   

6.
Randomized algorithms are gaining ground in high-performance computing applications as they have the potential to outperform deterministic methods, while still providing accurate results. We propose a randomized solver for distributed multicore architectures to efficiently solve large dense symmetric indefinite linear systems that are encountered, for instance, in parameter estimation problems or electromagnetism simulations. The contribution of this paper is to propose efficient kernels for applying random butterfly transformations and a new distributed implementation combined with a runtime (PaRSEC) that automatically adjusts data structures, data mappings, and the scheduling as systems scale up. Both the parallel distributed solver and the supporting runtime environment are innovative. To our knowledge, the randomization approach associated with this solver has never been used in public domain software for symmetric indefinite systems. The underlying runtime framework allows seamless data mapping and task scheduling, mapping its capabilities to the underlying hardware features of heterogeneous distributed architectures. The performance of our software is similar to that obtained for symmetric positive definite systems, but requires only half the execution time and half the amount of data storage of a general dense solver.  相似文献   

7.
Recently developed block-iterative versions of some row-action algorithms for solving general systems of sparse linear equations allow parallelism in the computations when the underlying problem is appropriately decomposed. However, problems associated with the parallel implementation of these algorithms have to be addressed. In this paper we present an implementation on distributed memory multiprocessors of a block version of the Kaczmarz row-action method. One of the main issues related to the efficient implementation of this method on a concurrent environment is to develop suitable communication schemes in order to reduce the amount of communication needed at each iteration. We propose two data distribution strategies which lead to different computation and communication schemes. To verify and compare the effectiveness of the proposed strategies, numerical experiments have been carried out on a Symult S2010 and a Meiko Computing Surface. The performance evaluation has been done using a scaled efficiency model.  相似文献   

8.
A preliminary report is given on the ITPACK project for developing adaptive iterative algorithms and software for solving large sparse PDE-related linear systems of equations. These iterative procedures are implemented in a collection of Fortran subroutines.  相似文献   

9.
This paper is concerned with the application of preconditioning techniques to the well known Jacobi iterative method for solving the finite difference equations derived from the discretization of self-adjoint elliptic partial differential equations. The convergence properties of this one parameter preconditioned method are analyzed and the value of the optimum preconditioning parameter and the performance of the method determined for a variety of standard problems.  相似文献   

10.
In this paper, a new two-step iterative method called the two-step parameterized (TSP) iteration method for a class of complex symmetric linear systems is developed. We investigate its convergence conditions and derive the quasi-optimal parameters which minimize the upper bound of the spectral radius of the iteration matrix of the TSP iteration method. Meanwhile, some more practical ways to choose iteration parameters for the TSP iteration method are proposed. Furthermore, comparisons of the TSP iteration method with some existing ones are given, which show that the upper bound of the spectral radius of the TSP iteration method is smaller than those of the modified Hermitian and skew-Hermitian splitting (MHSS), the preconditioned MHSS (PMHSS), the combination method of real part and imaginary part (CRI) and the parameterized variant of the fixed-point iteration adding the asymmetric error (PFPAE) iteration methods proposed recently. Inexact version of the TSP iteration (ITSP) method and its convergence properties are also presented. Numerical experiments demonstrate that both TSP and ITSP are effective and robust when they are used either as linear solvers or as matrix splitting preconditioners for the Krylov subspace iteration methods and they have comparable advantages over some known ones for the complex symmetric linear systems.  相似文献   

11.
There are verities of useful Krylov subspace methods to solve nonsymmetric linear system of equations. GMRES is one of the best Krylov solvers with several different variants to solve large sparse linear systems. Any GMRES implementation has some advantages. As the solution of ill-posed problems are important. In this paper, some GMRES variants are discussed and applied to solve these kinds of problems. Residual smoothing techniques are efficient ways to accelerate the convergence speed of some iterative methods like CG variants. At the end of this paper, some residual smoothing techniques are applied for different GMRES methods to test the influence of these techniques on GMRES implementations.  相似文献   

12.
An algebraic multigrid method (AMG) for solving convection-diffusion optimality systems is presented. Results of numerical experiments demonstrate robustness of the AMG scheme with respect to changes of the weight of the cost of the control and show that the computational performance of the proposed AMG scheme is comparable to that of AMG applied to single scalar equations.  相似文献   

13.
14.
《Parallel Computing》1997,23(6):637-647
This paper considers the parallel solution of finite element equations using the preconditioned conjugate gradient method on shared memory multiprocessors. The preconditioner used is an incomplete LU factorization of the stiffness matrix. We have designed a new method for implementing parallel forward and backward substitutions which requires the L and U factors to have an independent column structure, which we refer to as I2LU. The algorithm has been implemented on an Encore Multimax and parallel efficiencies up to 80% of the maximum possible theoretical value have been obtained using around 6–8 processors. These figures are shown to be comparable with the efficiencies of an implementation of a similar row-oriented approach, based on level scheduling, tested with the same problems.  相似文献   

15.
U. Derigs  A. Metz 《Computing》1986,36(4):301-311
We describe a new implementation of the shortest augmenting path approach for solving sparse assignment problems and report computational experience documenting its efficiency.  相似文献   

16.
Jan Mayer 《Computing》2009,86(4):291-312
In this article, we present two new algorithms for solving given triangular systems in parallel on a shared memory architecture. Multilevel incomplete LU factorization based preconditioners, which have been very successful for solving linear systems iteratively, require these triangular solves. Hence, the algorithms presented here can be seen as parallelizing the application of these preconditioners. The first algorithm solves the triangular matrix by block anti-diagonals. The drawback of this approach is that it can be difficult to choose an appropriate block structure. On the other hand, if a good block partition can be found, this algorithm can be quite effective. The second algorithm takes a hybrid approach by solving the triangular system by block columns and anti-diagonals. It is usually as effective as the first algorithm, but the block structure can be chosen in a nearly optimal manner. Although numerical results indicate that the speed-up can be fairly good, systems with matrices having a strong diagonal structure or narrow bandwidth cannot be solved effectively in parallel. Hence, for these matrices, the results are disappointing. On the other hand, the results are better for matrices having a more uniform distribution of non-zero elements. Although not discussed in this article, these algorithms can possibly be adapted for distributed memory architectures.  相似文献   

17.
The Journal of Supercomputing - In a large number of scientific applications, the solution of sparse linear systems is the stage that concentrates most of the computational effort. This situation...  相似文献   

18.
《国际计算机数学杂志》2012,89(3-4):303-320
In this work we propose a direct method for solving systems of linear equations which is based on a successive LU-decomposition of matrices of the form l + uv T . Simultaneously, the factors of an LU-decomposition of the coefficient matrix are obtained. A specific choice of the “rank-one decomposition” of the given matrix leads to a variant of the Gauss elimination process.  相似文献   

19.
In this paper, we propose an extended block Krylov process to construct two biorthogonal bases for the extended Krylov subspaces \(\mathbb {K}_{m}^e(A,V)\) and \(\mathbb {K}_{m}^e(A^{T},W)\), where \(A \in \mathbb {R}^{n \times n}\) and \(V,~W \in \mathbb {R}^{n \times p}\). After deriving some new theoretical results and algebraic properties, we apply the proposed algorithm with moment matching techniques for model reduction in large scale dynamical systems. Numerical experiments for large and sparse problems are given to show the efficiency of the proposed method.  相似文献   

20.
A couple of approximate inversion techniques are presented which provide a parallel enhancement to several iterative methods for solving linear systems arising from the discretization of boundary value problems. In particular, the Jacobi, Gauss‐Seidel, and successive overrelaxation methods can be improved substantially in a parallel environment by the extensions considered. A special case convergence proof is presented. The use of our approximate inverses with the preconditioned conjugate gradient method is examined and comparisons are made with some recently proposed algorithms in this area that also employ approximate inverses. The methods considered are compared under sequential and parallel hardware assumptions.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号