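Implementation of LU decomposition and Laplace algorithms on GPU
Cite this article: CHEN Ying, LIN Jin-xian, LÜ Tun. Implementation of LU decomposition and Laplace algorithms on GPU[J]. Journal of Computer Applications, 2011, 31(3): 851-855. DOI: 10.3724/SP.J.1087.2011.00851
Authors: CHEN Ying (陈颖), LIN Jin-xian (林锦贤), LÜ Tun (吕暾)
Affiliations:
1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108
2. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108; Fujian Supercomputing Center, Fuzhou University, Fuzhou 350108
3. Fujian Supercomputing Center, Fuzhou University, Fuzhou 350108; College of Biological Science and Technology, Fuzhou University, Fuzhou 350108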
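Funding: Key Special Research Project for Universities of Fujian Province; Young Talent Fund of the Fujian Provincial Department of Science and Technology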
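Abstract (translated from the Chinese; truncated in the source): With the substantial improvement in graphics processing unit (GPU) performance and the development of GPU programmability, many algorithms have been successfully ported to the GPU. LU decomposition and the Laplace algorithm are central to scientific computing, but their computational cost is often very high, so a method for accelerating them on the GPU is proposed. Both algorithms are implemented with Nvidia's Compute Unified Device Architecture (CUDA) programming model, by partitioning tasks between the CPU and the GPU, while making use of the GP...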

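Keywords: graphics processing unit (GPU); LU decomposition; Laplace algorithm; Compute Unified Device Architecture (CUDA); shared memory
Received: 2010-09-06
Revised: 2010-10-27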

Implementation of LU decomposition and Laplace algorithms on GPU
CHEN Ying, LIN Jin-xian, LÜ Tun. Implementation of LU decomposition and Laplace algorithms on GPU[J]. Journal of Computer Applications, 2011, 31(3): 851-855. DOI: 10.3724/SP.J.1087.2011.00851
Authors: CHEN Ying, LIN Jin-xian, LÜ Tun
Affiliation:
1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou, Fujian 350108, China
2. College of Mathematics and Computer Science, Fuzhou University, Fuzhou, Fujian 350108, China; Fujian Supercomputing Center, Fuzhou University, Fuzhou, Fujian 350108, China
3. Fujian Supercomputing Center, Fuzhou University, Fuzhou, Fujian 350108, China; College of Biological Science and Technology, Fuzhou University, Fuzhou, Fujian 350108, China
Abstract: As Graphics Processing Units (GPUs) have grown more powerful and more programmable, many algorithms have been successfully ported to them. LU decomposition and Laplace algorithms are central to scientific computing, but their computational cost is usually very high; a method for accelerating them on the GPU is therefore proposed. The implementation targets Nvidia GPUs that support the Compute Unified Device Architecture (CUDA). Four techniques are used to speed up the algorithms: dividing the work between the CPU and the GPU, using GPU shared memory to speed up data access, eliminating branches in the GPU program, and partitioning the matrix into strips. Experimental results show that, as the matrix size increases, the GPU-based implementation achieves a good speedup over the CPU-based implementation.
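For readers unfamiliar with LU decomposition, the following is a minimal sequential sketch in Python, not the authors' CUDA implementation, which this page does not reproduce. It assumes Doolittle elimination without pivoting and nonzero pivots; the names `lu_decompose` and `matmul` are illustrative only. The comment in the inner loop marks the row updates that are independent of one another, which is the kind of work a GPU implementation could plausibly distribute across threads.

```python
def lu_decompose(a):
    """Return (L, U) with L unit lower triangular and U upper triangular.

    Assumption: Doolittle elimination without pivoting, so every pivot
    must be nonzero. `a` is a square matrix given as a list of lists.
    """
    n = len(a)
    u = [row[:] for row in a]  # working copy that is reduced to U
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        pivot = u[k][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot; this sketch does not pivot")
        # For a fixed k, each row i below the pivot row is updated
        # independently of the other rows, so these iterations could
        # run in parallel.
        for i in range(k + 1, n):
            factor = u[i][k] / pivot
            l[i][k] = factor
            for j in range(k, n):
                u[i][j] -= factor * u[k][j]
    return l, u


def matmul(x, y):
    """Plain triple-loop matrix product, used only to check L * U == A."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]
```

Calling `lu_decompose` on a small matrix and multiplying the factors back together with `matmul` is a quick way to confirm that the product reproduces the input.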
This article is indexed in CNKI, Wanfang Data, and other databases.