
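GPU parallel particle swarm optimization algorithm based on adaptive warp
Cite this article: ZHANG Shuo, HE Fazhi, ZHOU Yi, YAN Xiaohu. GPU parallel particle swarm optimization algorithm based on adaptive warp[J]. Journal of Computer Applications, 2016, 36(12): 3274-3279.
Authors: ZHANG Shuo  HE Fazhi  ZHOU Yi  YAN Xiaohu
Affiliation: School of Computer, Wuhan University, Wuhan 430072
Funding: National Natural Science Foundation of China (61472289); Natural Science Foundation of Hubei Province (2015CFB254).
Abstract: This work improves the parallel particle swarm optimization (PSO) algorithm on the graphics processing unit (GPU) using the Compute Unified Device Architecture (CUDA). Given the characteristics of the CUDA hardware architecture, blocks are executed serially, and the warp is the basic unit that a streaming multiprocessor (SM) schedules and executes. To make full use of the parallelism of the threads within a block, a GPU parallel PSO algorithm based on adaptive warps is proposed: each dimension of a particle is mapped to a thread; using the GPU's warp-level parallelism, each particle is adaptively mapped to one or more warps according to its dimensionality; and one or more particles are adaptively mapped to each block. The method is compared with the existing coarse-grained parallel approach (one particle per thread) and the fine-grained parallel approach (one particle per block). Experimental results show that, relative to these two approaches, the proposed method raises the speedup over the CPU by up to 40.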

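Keywords: particle swarm optimization algorithm  parallel computing  graphics processing unit  Compute Unified Device Architecture  adaptive warp
Received: 2016-06-03
Revised: 2016-07-06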

GPU parallel particle swarm optimization algorithm based on adaptive warp
ZHANG Shuo, HE Fazhi, ZHOU Yi, YAN Xiaohu. GPU parallel particle swarm optimization algorithm based on adaptive warp[J]. Journal of Computer Applications, 2016, 36(12): 3274-3279.
Authors:ZHANG Shuo  HE Fazhi  ZHOU Yi  YAN Xiaohu
Affiliation:School of Computer, Wuhan University, Wuhan, Hubei 430072, China
Abstract:The parallel Particle Swarm Optimization (PSO) algorithm on the Graphics Processing Unit (GPU) was improved using the Compute Unified Device Architecture (CUDA). Given the structural characteristics of the CUDA hardware, blocks are executed serially, and the warp is the basic unit that a Streaming Multiprocessor (SM) schedules and executes. To make full use of thread parallelism within a block, a GPU parallel PSO algorithm based on adaptive warps was proposed. Each dimension of a particle was mapped to a thread. Using the warp-level parallelism of the GPU, each particle was adaptively mapped to one or more warps according to its dimensionality, and one or more particles were adaptively mapped to each block. The proposed approach was compared with the existing coarse-grained parallel approach (one particle per thread) and the fine-grained parallel approach (one particle per block). The experimental results show that, relative to these two approaches, the proposed approach increases the speedup over the CPU by up to 40.
Keywords:Particle Swarm Optimization (PSO) algorithm  parallel computing  Graphics Processing Unit (GPU)  Compute Unified Device Architecture (CUDA)  adaptive warp  
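
The mapping described in the abstract (one thread per dimension, one or more warps per particle, one or more particles per block) can be illustrated with a small host-side calculation. The sketch below is not the authors' implementation. The helper names `adaptive_layout` and `thread_to_work`, the 1024-thread block limit, and the rule of packing as many whole particles as fit into a block are all assumptions made for illustration. It is written in Python so the index arithmetic can be checked without a GPU.

```python
from dataclasses import dataclass

WARP_SIZE = 32            # threads per warp on NVIDIA GPUs
MAX_BLOCK_THREADS = 1024  # assumed per-block limit; real code would query the device


@dataclass(frozen=True)
class Layout:
    warps_per_particle: int
    particles_per_block: int
    threads_per_block: int
    num_blocks: int


def adaptive_layout(num_particles, dim, max_block_threads=MAX_BLOCK_THREADS):
    """Hypothetical helper: pick a warp-aligned launch layout for a swarm."""
    if num_particles < 1 or dim < 1:
        raise ValueError("num_particles and dim must be positive")
    # One thread per dimension, rounded up to a whole number of warps.
    warps_per_particle = -(-dim // WARP_SIZE)
    threads_per_particle = warps_per_particle * WARP_SIZE
    if threads_per_particle > max_block_threads:
        raise ValueError("a particle does not fit in one block in this sketch")
    # Assumed packing rule: as many whole particles per block as fit.
    particles_per_block = min(num_particles,
                              max_block_threads // threads_per_particle)
    num_blocks = -(-num_particles // particles_per_block)
    return Layout(warps_per_particle, particles_per_block,
                  particles_per_block * threads_per_particle, num_blocks)


def thread_to_work(block_idx, thread_idx, layout, num_particles, dim):
    """Return the (particle, dimension) a thread handles, or None if it is padding."""
    threads_per_particle = layout.warps_per_particle * WARP_SIZE
    local_particle, d = divmod(thread_idx, threads_per_particle)
    particle = block_idx * layout.particles_per_block + local_particle
    # Threads past the last dimension or the last particle stay idle.
    if particle >= num_particles or d >= dim:
        return None
    return particle, d
```

Under these assumptions, a 20-dimensional particle occupies one warp and a 50-dimensional particle occupies two, so a particle never shares a warp with another particle. The cost is the idle padding threads in the last warp of each particle.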