Affiliation: | a Dip. di Matematica Applicata ed Informatica, Università Ca' Foscari di Venezia, via Torino 155, 30173 Venezia Mestre, Italy b CNUCE - C.N.R., Consiglio Nazionale delle Ricerche, via Santa Maria 36, 56126 Pisa, Italy |
Abstract: | This paper presents SUPPLE (SUPort for Parallel Loop Execution), an innovative run-time support for the execution of parallel loops with regular stencil data references and non-uniform iteration costs. SUPPLE relies upon a static block data distribution to exploit locality, and combines static and dynamic policies for scheduling non-uniform iterations. It adopts, as far as possible, a static scheduling policy derived from the owner computes rule, and moves data and iterations among processors only if a load imbalance actually occurs. SUPPLE always tries to overlap communications with useful computations by reordering loop iterations and prefetching remote ones in the case of workload imbalance. The SUPPLE approach has been validated by many experimental results obtained by running a multi-dimensional flame simulation kernel on a 64-node Cray T3D. We have fed the benchmark code with several synthetic input data sets built on the basis of a load imbalance model. We have compared our results with those obtained with a CRAFT Fortran implementation of the benchmark. |