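基于MIC集群平台的GMRES算法并行加速 (Parallel Acceleration of the GMRES Algorithm on a MIC Cluster Platform)
Cite this article: WANG Ming-qing, LI Ming, ZHANG Qing, ZHANG Guang-yong, WU Shao-hua. Parallel Acceleration of the GMRES Algorithm on a MIC Cluster Platform[J]. Computer Science (计算机科学), 2017, 44(4): 197-201, 240
Authors: WANG Ming-qing  LI Ming  ZHANG Qing  ZHANG Guang-yong  WU Shao-hua
Affiliations: National Key Laboratory for High-efficient Server and Storage Technology, Inspur, Jinan 250101 (WANG Ming-qing, ZHANG Qing, ZHANG Guang-yong, WU Shao-hua); School of Mathematics, Taiyuan University of Technology, Taiyuan 030024 (LI Ming)
Abstract: The generalized minimal residual method (GMRES) is one of the most commonly used methods for solving large-scale nonsymmetric sparse linear systems; it converges quickly and has good stability. The Intel Xeon Phi many-core coprocessor (MIC) offers strong computing power and is easy to program and to port code to. The GMRES algorithm was ported to a MIC cluster platform using a hybrid MPI+OpenMP+offload programming model. Several optimizations, including asynchronous hiding of inter-process collective communication, data transfer optimization, vectorization, and thread affinity optimization, substantially improved the solution efficiency of GMRES. Finally, the parallel algorithm was applied to the problem of solving high-dimensional partial differential equations with local radial basis functions. Tests show that with 32 processes on a cluster of CPU nodes the parallel efficiency reaches 71.74%, and that 4 MIC cards deliver up to 7 times the performance of a single CPU.

Keywords: generalized minimal residual method  MIC  MPI  large-scale linear systems
Received: 2016-02-27
Revised: 2016-06-15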

Speedup of GMRES Based on MIC Heterogeneous Cluster Platform
WANG Ming-qing,LI Ming,ZHANG Qing,ZHANG Guang-yong and WU Shao-hua. Speedup of GMRES Based on MIC Heterogeneous Cluster Platform[J]. Computer Science, 2017, 44(4): 197-201, 240
Authors:WANG Ming-qing  LI Ming  ZHANG Qing  ZHANG Guang-yong  WU Shao-hua
Affiliation: National Key Laboratory for High-efficient Server and Storage Technology, Inspur, Jinan 250101, China; School of Mathematics, Taiyuan University of Technology, Taiyuan 030024, China; National Key Laboratory for High-efficient Server and Storage Technology, Inspur, Jinan 250101, China; National Key Laboratory for High-efficient Server and Storage Technology, Inspur, Jinan 250101, China; National Key Laboratory for High-efficient Server and Storage Technology, Inspur, Jinan 250101, China
Abstract: The generalized minimal residual method (GMRES) is one of the most commonly used methods for solving large-scale nonsymmetric sparse linear systems, and it offers fast convergence and good stability. The Intel Xeon Phi Many Integrated Core (MIC) coprocessor has strong computing power and is easy to program and to port code to. In this paper, an MPI+OpenMP+offload hybrid programming model was used to port the GMRES algorithm to a MIC heterogeneous cluster platform. The performance of the parallel GMRES algorithm was greatly improved by several optimizations, including hiding collective communication through asynchronous execution, vectorization, data transfer optimization, and MIC thread affinity optimization. Finally, the parallel GMRES algorithm was applied to solving high-dimensional PDEs with the localized radial basis function (RBF) collocation method. Tests indicate that the parallel efficiency reaches 71.74% with 32 processes on the CPU cluster, and that the speedup of 4 MIC cards over a single CPU can reach 7.
Keywords:GMRES  MIC  MPI  Large-scale linear algebraic equations
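
For readers unfamiliar with the method named in the abstract, the following is a minimal serial sketch of restarted GMRES in Python with NumPy. It is not the authors' MPI+OpenMP+offload implementation; the function name `gmres`, its parameters, and the tridiagonal test matrix are assumptions made for illustration. In a distributed version, the inner products and norms in the Arnoldi loop are the steps that require collective communication between processes.

```python
import numpy as np


def gmres(A, b, x0=None, restart=30, tol=1e-10, max_restarts=50):
    """Restarted GMRES(m) sketch for A x = b (hypothetical helper, serial only).

    Returns the approximate solution and the final relative residual.
    """
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    b_norm = np.linalg.norm(b) or 1.0  # avoid dividing by zero when b = 0
    m = min(restart, n)

    for _ in range(max_restarts):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta / b_norm < tol:
            break

        V = np.zeros((n, m + 1))  # orthonormal Krylov basis
        H = np.zeros((m + 1, m))  # upper Hessenberg matrix
        V[:, 0] = r / beta
        k = m
        for j in range(m):
            w = A @ V[:, j]
            # Modified Gram-Schmidt: each dot product would be an
            # all-reduce in a distributed-memory implementation.
            for i in range(j + 1):
                H[i, j] = V[:, i] @ w
                w -= H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] < 1e-14:  # Krylov space exhausted
                k = j + 1
                break
            V[:, j + 1] = w / H[j + 1, j]

        # Minimize ||beta * e1 - H y|| over the Krylov subspace.
        e1 = np.zeros(k + 1)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)
        x = x + V[:, :k] @ y

    rel_res = np.linalg.norm(b - A @ x) / b_norm
    return x, rel_res


if __name__ == "__main__":
    # Small nonsymmetric tridiagonal system, chosen only for illustration.
    n = 20
    A = 4 * np.eye(n) - np.eye(n, k=1) - 2 * np.eye(n, k=-1)
    b = A @ np.ones(n)
    x, rel_res = gmres(A, b)
    print("relative residual:", rel_res)
```

The least-squares problem is solved here with `np.linalg.lstsq` for brevity; production implementations usually update it incrementally with Givens rotations so that the residual norm is available at every inner step.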