Derivation of optimal input parameters for minimizing execution time of matrix-based computations on a GPU |
| |
Affiliation: | 1. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; 2. School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050018, China; 1. Pacific Northwest National Laboratory, Richland, WA 99354, USA; 2. Pacific Northwest National Laboratory, Seattle, WA 98109, USA; 3. NVIDIA Research, Santa Clara, CA 95051, USA; 1. EECS, University of Central Florida, Orlando, United States; 2. Department of Computer Science, Virginia Tech, Blacksburg, VA 2406, United States; 1. Dept. of Computer Science and Engineering, Konkuk University, Seoul, Republic of Korea; 2. Center for Experimental Research in Computer Systems, Georgia Institute of Technology, Atlanta, GA, USA; 1. MCS, Argonne National Laboratory, Argonne, IL 60439, USA; 2. DISCA, Universitat Politècnica de València, 46022 Valencia, Spain; 3. DICC, Universitat Jaume I, 12071 Castellón, Spain |
| |
Abstract: | As GPUs are increasingly used as coprocessors, the demand for using them optimally for various computations continues to grow. The goal of this work is to derive input parameters that yield the minimum execution time for matrix-based computations executing on a GPU. Input parameters are defined as the dimensions of the grid and blocks assigned for execution on the GPU. Because the input parameters alone do not adequately represent the execution behavior of the GPU, execution metrics are formulated as functions of the input parameters to represent that behavior. The execution metrics are architecture independent and are used to derive optimal input parameters, that is, the input parameters that yield the minimum execution time. Optimal input parameters are derived for the following matrix-based computations: matrix–vector multiplication (Mv), matrix–matrix multiplication (MM), and convolution. The derivation allows optimal input parameters to be selected without executing code. Results for all matrix-based computations and sizes tested show that the derived optimal input parameters often yield the minimum execution time and, at worst, an execution time within 13.6% of the minimum. |
| |
Keywords: | GPU; Execution time; Matrix-based computations; Input parameters |
This article is indexed in ScienceDirect and other databases. |
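
The abstract defines input parameters as the grid and block dimensions and describes execution metrics as functions of those parameters. The following is a minimal Python sketch of that general idea only, not the paper's method: the function names (`launch_config`, `candidate_configs`, `least_idle`), the one-thread-per-row mapping, the `idle_fraction` metric, and the selection rule are all hypothetical illustrations. It also assumes a warp size of 32 and a maximum of 1024 threads per block, which are typical of current NVIDIA GPUs, whereas the paper's own metrics are stated to be architecture independent.

```python
# Illustrative only: these metrics and the selection rule are assumptions,
# not the execution metrics derived in the paper.

WARP_SIZE = 32  # assumed warp size (typical of NVIDIA GPUs)


def launch_config(n_rows, block_dim):
    """Return a 1-D launch configuration and simple metrics for one thread per row."""
    # Smallest grid that covers every row (integer ceiling division).
    grid_dim = (n_rows + block_dim - 1) // block_dim
    launched = grid_dim * block_dim
    return {
        "block_dim": block_dim,
        "grid_dim": grid_dim,
        "launched_threads": launched,
        "warps_per_block": (block_dim + WARP_SIZE - 1) // WARP_SIZE,
        # Fraction of launched threads that have no row to process.
        "idle_fraction": (launched - n_rows) / launched,
    }


def candidate_configs(n_rows, max_block_dim=1024):
    """Enumerate block sizes that are whole multiples of the warp size."""
    return [
        launch_config(n_rows, block_dim)
        for block_dim in range(WARP_SIZE, max_block_dim + 1, WARP_SIZE)
    ]


def least_idle(n_rows):
    """Hypothetical rule: fewest idle threads, ties broken by the larger block."""
    return min(
        candidate_configs(n_rows),
        key=lambda cfg: (cfg["idle_fraction"], -cfg["block_dim"]),
    )
```

The point of the sketch is that each candidate configuration can be scored from the input parameters alone, without running a kernel. A real selection would replace `idle_fraction` with metrics that actually predict execution time for the computation in question.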
|