Computations with symmetric, positive definite and band matrices on a parallel vector processor
Authors: Zahari Zlatev
Affiliation:

Air Pollution Laboratory, Danish Agency of Environmental Protection, Risø National Laboratory, DK-4000, Roskilde, Denmark

Mathematical Software Group, CRAY Research Inc., 1345 Northland Drive, Mendota Heights, MN 55120, U.S.A.

Danish Computing Centre for Research and Education, UNI•C, Region Lyngby, DK-2800, Lyngby, Denmark

Department of Chemical Physics, The H.C. Ørsted Institute, University of Copenhagen, DK-2100, Copenhagen Ø, Denmark

Abstract: Computations involving symmetric, positive definite and band matrices are kernel operations in the numerical treatment of many models arising in science and engineering. It is desirable to achieve a high level of performance when such operations are to be carried out on a vector processor. If the operations are performed by rows or columns (as in the EXTENDED BLAS subroutines), then the loops are vectorized but the speed of computations, measured in Mflops, is not very high, because the arrays involved are normally short. Therefore the computations should be organized by diagonals. Furthermore, some special devices are to be applied in order to unroll the loops. Finally, one should be careful with the storage scheme. It is demonstrated that if (i) the computations are organized by diagonals, (ii) the main loops are unrolled and (iii) the storage scheme is such that the work with some zero-elements is avoided, then the speed of computations is nearly the same as that obtained in the computations with dense matrices. If a particular vector machine is in use (in our case a CRAY X-MP computer), then the speed can be increased further by (iv) coding some basic operations in machine language and (v) using the different processors of the vector computer in parallel. The efficiency of the exploitation of the special features of the particular computer that is to be used is also illustrated by numerical examples.

Kernel subroutines performing matrix-vector multiplications are described. Representative tests are used to demonstrate the efficiency of these kernels.
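
The idea of organizing the matrix-vector product by diagonals can be illustrated with a minimal sketch. It is not the kernel described in the paper: the function names `sym_band_matvec_by_diagonals` and `dense_from_diagonals`, and the storage layout (a list in which entry `k` holds only the `n - k` elements of the k-th superdiagonal, so no padding zeros are stored), are assumptions made here for illustration. Plain Python lists stand in for the vector registers of the machine.

```python
def sym_band_matvec_by_diagonals(diags, x):
    """Compute y = A x for a symmetric band matrix A stored by diagonals.

    Assumed (hypothetical) layout: diags[k][i] == A[i][i + k] for
    k = 0..m and i = 0..n-k-1, where m is the half-bandwidth.
    Only the upper triangle is stored; symmetry supplies the rest.
    """
    n = len(x)
    # Main diagonal: one loop of length n.
    y = [diags[0][i] * x[i] for i in range(n)]
    # Each off-diagonal k contributes two loops of length n - k,
    # which stay long as long as the bandwidth is small relative to n.
    for k in range(1, len(diags)):
        d = diags[k]
        for i in range(n - k):
            y[i] += d[i] * x[i + k]      # superdiagonal entry A[i][i+k]
            y[i + k] += d[i] * x[i]      # mirrored entry A[i+k][i]
    return y


def dense_from_diagonals(diags, n):
    """Expand the diagonal storage into a dense n x n list of lists (for checking)."""
    a = [[0.0] * n for _ in range(n)]
    for k, d in enumerate(diags):
        for i in range(n - k):
            a[i][i + k] = d[i]
            a[i + k][i] = d[i]
    return a


# Tridiagonal example: 4 on the main diagonal, 1 on the off-diagonals.
diags = [[4.0, 4.0, 4.0, 4.0], [1.0, 1.0, 1.0]]
x = [1.0, 2.0, 3.0, 4.0]
y = sym_band_matvec_by_diagonals(diags, x)
```

In a row-oriented version the inner loop would run over at most 2m + 1 elements per row, which is short when the half-bandwidth m is small. In the diagonal-oriented version above, the inner loops have length n - k, which is the reason the abstract gives for preferring diagonals on a vector processor.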

Keywords: Linear algebra operations; conjugate gradients algorithms; vectorization; machine language; microtasking; symmetric matrices; positive definite matrices; band matrices; preconditioning; speed of computations on a vector machine
This article is indexed in ScienceDirect and other databases.