首页 | 本学科首页   官方微博 | 高级检索  
     


Improving the numerical stability and the performance of a parallel sparse solver
Authors:A. C. N. van DuinP. C. HansenTz. OstromskyH. WijshoffZ. Zlatev
Affiliation:

Leiden University, Leiden, The Netherlands

UNI·C, The Danish Computer Centre for Research and Education The Technical University of Denmark, Bldg. 304, DK-2800, Lyngby, Denmark

UNI·C, The Danish Computer Centre for Research and Education The Technical University of Denmark, Bldg. 304, DK-2800, Lyngby, Denmark

Leiden University, Leiden, The Netherlands

National Environmental Research Institute Frederiksborgvej 399, DK-4000, Roskilde, Denmark

Abstract:Coarse grain parallel codes for solving sparse systems of linear algebraic equations can be developed in several different ways. The following procedure is suitable for some parallel computers. A preliminary reordering of the matrix is first applied to move as many zero elements as possible to the lower left corner. After that the matrix is partitioned into large blocks and the blocks in the lower left corner contain only zero elements. An attempt to obtain a good load-balance is carried out by allowing the diagonal blocks to be rectangular.

While the algorithm based on the above ideas has good parallel properties, some stability problems may arise during the factorization because the pivotal search is restricted to the diagonal blocks. A simple a priori procedure has been used in a previous version in an attempt to stabilize the algorithm. In this paper it is shown that three enhanced stability devices can successfully be incorporated in the algorithm so that it is further stabilized and, moreover, the parallel properties of the original algorithm are preserved.

The first device is based on a dynamic check of the stability. In the second device a slightly modified reordering is used in an attempt to get more nonzero elements in the diagonal blocks (the number of candidates for pivots tends to increase in this situation and, therefore, there is a better chance to select more stable pivots). The third device applies a P5-like ordering as a secondary criterion in the basic reordering procedure. This tends to improve the reordering and the performance of the solver. Moreover, the device is stable, while the original P5 ordering is often unstable.

Numerical results obtained by using the three new devices are presented. The well-known sparse matrices from the Harwell-Boeing set are used in the experiments.

Keywords:Sparse matrix   General sparsity   Gaussian elimination   Drop-tolerance   Reordering   Block algorithm   Coarse-grain parallelism   Speed-up
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号