
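A Parallelizing Compilation Framework for Heterogeneous Many-Core Processors
Cite this article: LI Yan-Bing, ZHAO Rong-Cai, HAN Lin, ZHAO Jie, XU Jin-Long, LI Ying-Ying. A Parallelizing Compilation Framework for Heterogeneous Many-Core Processors[J]. Journal of Software, 2019, 30(4): 981-1001.
Authors: LI Yan-Bing  ZHAO Rong-Cai  HAN Lin  ZHAO Jie  XU Jin-Long  LI Ying-Ying
Affiliation: State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, Henan 450001, China (all six authors)
Funding: National Natural Science Foundation of China (61702546); National High Technology Research and Development Program of China (863 Program) (2014AA01A300)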
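Abstract: Heterogeneous many-core processors are an important trend in processor development for high-performance computing, but their more complex architecture makes programming markedly harder. To address this problem, this paper proposes a parallelizing compilation framework for heterogeneous many-core processors, built on the open-source compiler Open64, that automatically transforms programs into heterogeneous parallel programs. The framework consists mainly of four modules. The task partitioning module identifies program regions suitable for accelerated computation and implements a multi-dimensional parallelism recognition method for nested loops. The data layout module places data between main memory and the scratchpad memory (SPM) and implements array bound analysis and pointer range analysis. The transfer optimization module implements several data transfer optimizations, including transfer merging, transfer hoisting, packed transfer, and array transposition. The benefit estimation module implements a hybrid static and dynamic benefit estimation method built on a cost model. The framework is implemented for the SW26010 processor, and test results show that it can transform a number of programs into parallel form for the heterogeneous many-core architecture and achieve good speedups.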

Keywords: heterogeneous many-core processor  SW26010  parallelizing compilation  data transfer optimization  OpenACC
Received: 2016/12/13
Revised: 2017/1/23

Parallelizing Compilation Framework for Heterogeneous Many-core Processors
LI Yan-Bing, ZHAO Rong-Cai, HAN Lin, ZHAO Jie, XU Jin-Long and LI Ying-Ying. Parallelizing Compilation Framework for Heterogeneous Many-core Processors[J]. Journal of Software, 2019, 30(4): 981-1001.
Authors: LI Yan-Bing  ZHAO Rong-Cai  HAN Lin  ZHAO Jie  XU Jin-Long and LI Ying-Ying
Affiliation: State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China (all six authors)
Abstract: Heterogeneous many-core processors have become an important trend in high-performance computing, but their sophisticated architecture makes programming considerably harder. To address this problem, this study proposes a parallelizing compilation framework for heterogeneous many-core processors, based on the open-source Open64 compiler, that automates the transformation of a sequential program into heterogeneous parallel code. The framework comprises four main modules: a task partitioning module that identifies parallelizable regions and implements multi-dimensional parallelism recognition for nested loops; a data mapping module that maps data between main memory and the SPM using array boundary analysis and pointer range analysis; a transfer optimization module that merges, hoists, and packs data transfers and transposes arrays; and a performance estimation module that uses a hybrid dynamic-static method to estimate benefit based on a cost model for SW26010. The framework is implemented for the Sunway SW26010 processor and evaluated on a set of benchmarks. The experimental results show that it can parallelize these applications and obtain promising performance improvements on heterogeneous many-core platforms.
Keywords: heterogeneous many-core processor  SW26010  parallelizing compilation  data transmission optimizing  OpenACC
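
The abstract does not give the cost model itself, so the following is only a minimal sketch of how a hybrid static and dynamic benefit estimate could work in principle. It assumes the compiler estimates per-iteration costs statically and defers the final decision to run time, when the loop trip count is known. All names (`LoopProfile`, `Machine`, `should_offload`) and all numeric parameters are hypothetical illustrations, not the paper's model or measured SW26010 values; the sketch also assumes transfers are serialized and do not overlap with computation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopProfile:
    """Per-loop quantities a compiler might estimate statically (hypothetical)."""
    host_cycles_per_iter: float    # cost of one iteration on the host core
    accel_cycles_per_iter: float   # cost of one iteration on one accelerator core
    bytes_per_iter: int            # bytes moved between main memory and SPM per iteration
    launch_overhead_cycles: float  # fixed cost of starting the accelerator cores


@dataclass(frozen=True)
class Machine:
    """Hypothetical machine parameters, not measured SW26010 values."""
    accel_cores: int
    cycles_per_byte: float  # assumed transfer cost between main memory and SPM


def host_cost(loop: LoopProfile, trip_count: int) -> float:
    """Estimated cycles to run the whole loop sequentially on the host."""
    return loop.host_cycles_per_iter * trip_count


def offload_cost(loop: LoopProfile, machine: Machine, trip_count: int) -> float:
    """Estimated cycles to run the loop on the accelerator cores."""
    # Iterations are split evenly; the busiest core gets ceil(n / cores).
    iters_per_core = -(-trip_count // machine.accel_cores)
    compute = loop.accel_cycles_per_iter * iters_per_core
    # Assumption: transfers are serialized and not overlapped with compute.
    transfer = loop.bytes_per_iter * trip_count * machine.cycles_per_byte
    return loop.launch_overhead_cycles + compute + transfer


def should_offload(loop: LoopProfile, machine: Machine, trip_count: int) -> bool:
    """Run-time check: offload only when the estimated cost is lower."""
    return offload_cost(loop, machine, trip_count) < host_cost(loop, trip_count)


if __name__ == "__main__":
    loop = LoopProfile(10.0, 20.0, 8, 5000.0)
    machine = Machine(accel_cores=64, cycles_per_byte=0.5)
    for n in (100, 100_000):
        print(n, should_offload(loop, machine, n))
```

Under these assumed parameters, short loops stay on the host because the fixed launch overhead dominates, while long loops are offloaded. A compiler following this pattern would emit both code versions and guard them with such a check.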