High performance computational kernels for selected segments of a p finite element code

Authors:	E. Barragy; R. Van De Geijn

Abstract:	A high performance implementation is presented for three kernel routines commonly found in element-by-element preconditioned conjugate gradient finite element codes: forming the element stiffness matrices, forming the element load vectors (or, for a non-linear problem, the element residual vectors), and applying element matrix–vector products. The present study considers tensor product elements of arbitrary mapping in 2-D, although the generalization to triangular and serendipity elements is straightforward. The implementation is most appropriate for high-p type finite element methods, where the element matrices are relatively large and dense. The result is a set of high performance kernels for superscalar architectures, on which such codes may otherwise be memory bandwidth limited. Performance studies are presented for a representative superscalar microprocessor, the Intel i860. Because microprocessors of this type are at the heart of modern workstations as well as several parallel supercomputing systems, the work is relevant across a variety of platforms. The resulting kernels yield both high performance on a variety of sequential architectures and a high degree of code portability through the Basic Linear Algebra Subprograms (BLAS) mechanism.

Keywords:	BLAS; finite element; p method; conjugate gradient
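
The element matrix–vector product named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `element_matvec` and `ebe_matvec`, the connectivity layout, and the 1-D two-element toy mesh are all assumptions made for illustration, and the dense local product is written as a plain loop where a tuned code would presumably call a BLAS routine.

```python
def element_matvec(Ke, xe):
    """Dense local product y_e = K_e x_e.

    Hypothetical stand-in for the BLAS call a tuned kernel would use.
    """
    return [sum(k * x for k, x in zip(row, xe)) for row in Ke]


def ebe_matvec(element_matrices, connectivity, x):
    """Compute y = A x element by element, without assembling the global A."""
    y = [0.0] * len(x)
    for Ke, dofs in zip(element_matrices, connectivity):
        xe = [x[i] for i in dofs]       # gather the element's degrees of freedom
        ye = element_matvec(Ke, xe)     # dense local matrix-vector product
        for i, v in zip(dofs, ye):      # scatter-add into the global vector
            y[i] += v
    return y


# Toy data (assumed): two 1-D linear elements sharing node 1.
K = [[1.0, -1.0], [-1.0, 1.0]]
element_matrices = [K, K]
connectivity = [[0, 1], [1, 2]]
x = [1.0, 2.0, 4.0]
y = ebe_matvec(element_matrices, connectivity, x)
```

For high-p elements each `Ke` is large and dense, so nearly all of the work falls in the local product; that is the step where a portable, optimized linear algebra library would be expected to pay off.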
