
Optimization of 3D Finite Difference Algorithm on Intel MIC
Cite this article: HAO Xin, GUO Shao-zhong. Optimization of 3D Finite Difference Algorithm on Intel MIC[J]. Computer Science, 2017, 44(5): 26-32.
Authors: HAO Xin, GUO Shao-zhong
Affiliation: State Key Laboratory of Mathematics Engineering and Advanced Computing, Zhengzhou 450002, China (both authors)
Received: 2016/4/30
Revised: 2016/8/27
Abstract: The finite difference algorithm is a numerical discretization method for partial differential equations that is widely used to simulate elastic wave propagation. Its long-stride memory accesses, high computational density, and low CPU utilization make it a performance bottleneck in practical applications. To address these problems, this paper analyzes the 3D finite difference (3DFD) algorithm in detail and proposes a three-step progressive method for optimizing it on the Intel MIC (Many Integrated Cores) architecture. First, basic optimizations such as branch elimination, loop unrolling, and loop-invariant hoisting reduce computational strength and remove obstacles to vectorization (SIMD, Single Instruction Multiple Data). Second, parallel optimizations, namely data dependence analysis, loop tiling, and rewriting the core kernel with vector instructions, take full advantage of the many threads and long vectors of the MIC coprocessor. Finally, on a heterogeneous many-core platform (CPU+MIC), heterogeneous cooperative optimizations such as minimizing data transfer and load balancing allow the CPU and the MIC to compute in parallel. Experimental results show that the optimized 3DFD algorithm achieves a 50x to 120x speedup over the original algorithm on the heterogeneous platform.

Keywords: Finite difference algorithm, MIC architecture, Vectorization (SIMD), Heterogeneous cooperation, Parallel computing
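
The first optimization step named in the abstract (branch elimination, loop unrolling, invariant hoisting) can be illustrated with a minimal scalar C sketch. The paper's actual kernel is not reproduced here, so the stencil shape, the half-width `R`, the function names `fd3d_naive` and `fd3d_hoisted`, and the `FD3D_DEMO_MAIN` guard are all assumptions made for illustration. The MIC vector intrinsics and the CPU+MIC cooperation of the later steps are not shown.

```c
#include <assert.h>
#include <stddef.h>

#define R 4 /* stencil half-width; an assumed value, not taken from the paper */

/* Flattened index into an nx*ny*nz grid with x as the fastest dimension. */
static inline size_t idx(int x, int y, int z, int nx, int ny) {
    return ((size_t)z * ny + y) * nx + x;
}

/* Baseline: the boundary test sits inside the innermost loop, which puts a
 * branch on the hot path and recomputes the full index for every access. */
void fd3d_naive(const float *prev, float *next, const float *coeff,
                int nx, int ny, int nz) {
    for (int z = 0; z < nz; ++z)
        for (int y = 0; y < ny; ++y)
            for (int x = 0; x < nx; ++x) {
                if (x < R || x >= nx - R || y < R || y >= ny - R ||
                    z < R || z >= nz - R)
                    continue; /* halo cells are left untouched */
                float v = coeff[0] * prev[idx(x, y, z, nx, ny)];
                for (int r = 1; r <= R; ++r)
                    v += coeff[r] *
                         (prev[idx(x + r, y, z, nx, ny)] + prev[idx(x - r, y, z, nx, ny)] +
                          prev[idx(x, y + r, z, nx, ny)] + prev[idx(x, y - r, z, nx, ny)] +
                          prev[idx(x, y, z + r, nx, ny)] + prev[idx(x, y, z - r, nx, ny)]);
                next[idx(x, y, z, nx, ny)] = v;
            }
}

/* Sketch of the basic optimizations: loop bounds restricted to the interior
 * (branch elimination), strides and row pointers computed once per row
 * (invariant hoisting), and a constant-trip-count inner loop that a compiler
 * can unroll and vectorize more easily. */
void fd3d_hoisted(const float *prev, float *next, const float *coeff,
                  int nx, int ny, int nz) {
    assert(nx > 2 * R && ny > 2 * R && nz > 2 * R);
    const ptrdiff_t sy = nx;                 /* distance between y-neighbours */
    const ptrdiff_t sz = (ptrdiff_t)nx * ny; /* distance between z-neighbours */
    const float c0 = coeff[0];
    for (int z = R; z < nz - R; ++z)
        for (int y = R; y < ny - R; ++y) {
            const float *p = prev + idx(0, y, z, nx, ny); /* row base, hoisted */
            float *q = next + idx(0, y, z, nx, ny);
            for (int x = R; x < nx - R; ++x) {
                float v = c0 * p[x];
                for (int r = 1; r <= R; ++r)
                    v += coeff[r] * (p[x + r] + p[x - r] +
                                     p[x + r * sy] + p[x - r * sy] +
                                     p[x + r * sz] + p[x - r * sz]);
                q[x] = v;
            }
        }
}

#ifdef FD3D_DEMO_MAIN
#include <stdio.h>
int main(void) {
    enum { NX = 16, NY = 16, NZ = 16, N = NX * NY * NZ };
    static float prev[N], a[N], b[N];
    /* Dyadic coefficients keep the float arithmetic exact in this demo. */
    const float c[R + 1] = {-2.0f, 0.5f, 0.25f, 0.125f, 0.0625f};
    for (int i = 0; i < N; ++i) prev[i] = (float)(i % 7);
    fd3d_naive(prev, a, c, NX, NY, NZ);
    fd3d_hoisted(prev, b, c, NX, NY, NZ);
    int same = 1;
    for (int i = 0; i < N; ++i) same &= (a[i] == b[i]);
    printf("kernels agree: %s\n", same ? "yes" : "no");
    return 0;
}
#endif
```

Compiling with `-DFD3D_DEMO_MAIN` builds the small comparison driver. Both kernels add the six neighbours in the same order, so they are intended to produce the same values on the interior. The hoisted version simply gives the compiler a branch-free, unit-stride inner loop, which is the precondition the abstract describes for the later vectorization step.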