Computations with symmetric, positive definite and band matrices on a parallel vector processor

Authors:	Zahari Zlatev

Affiliation:	Air Pollution Laboratory, Danish Agency of Environmental Protection, Risø National Laboratory, DK-4000, Roskilde, Denmark; Mathematical Software Group, CRAY Research Inc., 1345 Northland Drive, Mendota Heights, MN 55120, U.S.A.; Danish Computing Centre for Research and Education, UNI•C, Region Lyngby, DK-2800, Lyngby, Denmark; Department of Chemical Physics, The H.C. Ørsted Institute, University of Copenhagen, DK-2100, Copenhagen Ø, Denmark

Abstract:	Computations involving symmetric, positive definite and band matrices are kernel operations in the numerical treatment of many models arising in science and engineering. It is desirable to achieve a high level of performance when such operations are carried out on a vector processor. If the operations are performed by rows or columns (as in the EXTENDED BLAS subroutines), then the loops are vectorized, but the speed of computations, measured in Mflops, is not very high because the arrays involved are normally short. Therefore the computations should be organized by diagonals. Furthermore, some special devices must be applied in order to unroll the loops. Finally, one should be careful with the storage scheme. It is demonstrated that if (i) the computations are organized by diagonals, (ii) the main loops are unrolled and (iii) the storage scheme is such that work with some zero elements is avoided, then the speed of computations is nearly the same as that obtained in computations with dense matrices. If a particular vector machine is in use (in our case a CRAY X-MP computer), then the speed can be increased further by (iv) coding some basic operations in machine language and (v) using the different processors of the vector computer in parallel. The efficiency gained by exploiting the special features of the particular computer in use is also illustrated by numerical examples. Kernel subroutines performing matrix-vector multiplications are described. Representative tests are used to demonstrate the efficiency of these kernels.
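
The idea of organizing a symmetric band matrix-vector product by diagonals, with a storage scheme that holds no padding zeros, can be sketched as follows. This is only an illustration under stated assumptions, not the kernels described in the paper: the function name `sym_band_matvec` and the list-of-diagonals layout are hypothetical, and plain Python does not vectorize, so the sketch shows the access pattern rather than the performance.

```python
def sym_band_matvec(diags, x):
    """Compute y = A x for a symmetric band matrix stored by diagonals.

    Assumed (hypothetical) layout: diags[0] is the main diagonal of
    length n, and diags[k] is the k-th superdiagonal of length n - k,
    so that diags[k][i] == A[i][i + k]. No padding zeros are stored.
    """
    n = len(x)
    # Contribution of the main diagonal.
    y = [diags[0][i] * x[i] for i in range(n)]
    # Each off-diagonal is one long vector of length n - k. A row-wise
    # sweep would instead work on vectors no longer than the bandwidth.
    for k in range(1, len(diags)):
        d = diags[k]
        for i in range(n - k):
            y[i] += d[i] * x[i + k]      # entry A[i][i+k] (upper triangle)
            y[i + k] += d[i] * x[i]      # entry A[i+k][i] (by symmetry)
    return y


# Small tridiagonal example: 4 on the main diagonal, -1 next to it.
diags = [[4.0, 4.0, 4.0, 4.0], [-1.0, -1.0, -1.0]]
x = [1.0, 2.0, 3.0, 4.0]
y = sym_band_matvec(diags, x)
```

On a vector machine the inner loop over `i` is the one that would be vectorized, and its length stays close to the matrix order for every stored diagonal. Unrolling, in this picture, would mean handling several diagonals per pass over `y`.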

Keywords:	Linear algebra operations; conjugate gradients algorithms; vectorization; machine language; microtasking; symmetric matrices; positive definite matrices; band matrices; preconditioning; speed of computations on a vector machine
This article is indexed in ScienceDirect and other databases.
