
HPL Approach for Heterogeneous Computer Platforms

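Cite this article:	SUN Qiao, SUN Jia-Chang, MA Wen-Jing, ZHAO Yu-Wen. HPL Approach for Heterogeneous Computer Platforms[J]. Journal of Software, 2021, 32(8): 2329-2340.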

Authors:	SUN Qiao, SUN Jia-Chang, MA Wen-Jing, ZHAO Yu-Wen

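Affiliation:	Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China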

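Funding:	National Key Research and Development Program of China (2018YFB0204404); Strategic Priority Research Program of the Chinese Academy of Sciences (Category C) (XDC01030200)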

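Abstract:	HPL (high performance Linpack) is a benchmark suite widely used to evaluate computer performance. For decades, academia and industry have paid close attention to customized optimization of HPL so that it fully reflects the performance of the emerging computer platforms of the day. Targeting today's mainstream multi-device heterogeneous computing platforms, this paper attempts to provide a solution for HPL optimization: Hetero-HPL. In Hetero-HPL...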
Keywords:	HPL (high performance Linpack); multi-device heterogeneous platform; parallel computing
Received:	2019/8/22
Revised:	2019/12/5
HPL Approach for Heterogeneous Computer Platforms

SUN Qiao, SUN Jia-Chang, MA Wen-Jing, ZHAO Yu-Wen. HPL Approach for Heterogeneous Computer Platforms[J]. Journal of Software, 2021, 32(8): 2329-2340.

Authors:	SUN Qiao, SUN Jia-Chang, MA Wen-Jing, ZHAO Yu-Wen

Affiliation:	Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China

Abstract:	HPL (high performance Linpack) is a widely used benchmark for measuring computer performance. Over the decades, optimizing and tuning HPL has drawn sustained attention in both industrial and academic circles, so that the benchmark fully reflects the performance of contemporary cutting-edge computer platforms. For current heterogeneous HPC platforms with multiple accelerating co-processors, this paper proposes a high-performance HPL approach, Hetero-HPL. In Hetero-HPL, the mapping between the process set and the (co-)processor set is adjustable, so that computation within each computing node can avoid inter-process message exchange, and each important procedure of the HPL algorithm can make full use of the node's hardware resources, such as memory, CPU cores, co-processors, and the PCI-e bus. Without redundant computation or communication, the working set of Hetero-HPL is not restricted by the limit on pinned memory size in a single allocation, and it is distributed so that the workload is balanced among all co-processors and massive fine-grained parallelism can be exploited. On an experimental platform with four co-processors, Hetero-HPL reaches an efficiency of 76.5% in one computing node (the efficiency of the dgemm function is 84%), and further experimentation suggests that Hetero-HPL is also a feasible approach in distributed environments.

Keywords:	HPL (high performance Linpack); multi-device heterogeneous platform; parallel computing
This article is indexed in Wanfang Data and other databases.