Efficient Concurrent L1-Minimization Solvers on GPUs

Authors:	Xinyue Chu, Jiaquan Gao, Bo Sheng

Affiliation:	1 Jiangsu Key Laboratory for NSLSCS, School of Computer and Electronic Information, Nanjing Normal University, Nanjing 210023, China; 2 Department of Computer Science, University of Massachusetts Boston, MA 02125, USA

Abstract:	Many real applications require solving concurrent L1-minimization (L1-min) problems, so this paper investigates how to solve them in parallel on GPUs. First, we propose a self-adaptive warp implementation of the matrix-vector multiplication Ax and a self-adaptive thread implementation of the transposed matrix-vector multiplication A^Tx on the GPU. Vector-operation and inner-product decision trees select the best vector-operation and inner-product kernels for vectors of any size. Second, building on these kernels, we use the iterative shrinkage-thresholding algorithm to construct two concurrent L1-min solvers, one organized around streams and one around thread blocks on a single GPU, and we optimize their performance with GPU features such as the shuffle instruction and the read-only data cache. Finally, we design a concurrent L1-min solver for multiple GPUs. Experimental results validate the effectiveness and performance of the proposed methods.
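
To make the role of the two matrix-vector products concrete, here is a minimal CPU reference sketch of the iterative shrinkage-thresholding iteration in plain Python. It is not the authors' GPU implementation. It assumes the common unconstrained form min 0.5*||Ax - b||^2 + lam*||x||_1 (the abstract does not state the exact formulation), and the names `matvec`, `matvec_t`, `soft_threshold`, `ista`, `lam`, and `step` are hypothetical.

```python
def matvec(A, x):
    """Dense y = A x, with A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def matvec_t(A, r):
    """Dense g = A^T r, computed one column of A at a time."""
    rows, cols = len(A), len(A[0])
    return [sum(A[i][j] * r[i] for i in range(rows)) for j in range(cols)]


def soft_threshold(v, t):
    """Shrinkage operator: sign(v) * max(|v| - t, 0)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, b, lam, step, iters=500):
    """Assumed objective: 0.5*||Ax - b||^2 + lam*||x||_1.

    step should not exceed 1 / (largest eigenvalue of A^T A).
    """
    x = [0.0] * len(A[0])
    for _ in range(iters):
        # Residual r = A x - b uses the Ax product.
        r = [yi - bi for yi, bi in zip(matvec(A, x), b)]
        # Gradient of the smooth term, g = A^T r, uses the A^T x product.
        g = matvec_t(A, r)
        # Gradient step followed by element-wise shrinkage.
        x = [soft_threshold(xj - step * gj, step * lam) for xj, gj in zip(x, g)]
    return x


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(ista(identity, [3.0, 0.5], lam=1.0, step=1.0, iters=10))  # [2.0, 0.0]
```

Each iteration needs one Ax product, one A^Tx product, and a few element-wise vector operations, which is why the abstract concentrates on those kernels. In a concurrent setting, several such independent solves would run side by side.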

Keywords:	Concurrent L1-minimization problem; dense matrix-vector multiplication; fast iterative shrinkage-thresholding algorithm; CUDA; GPUs

Source:	Computer Systems Science and Engineering
