
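SWAN, a Task-Parallel Scheduling System for the SW26010 Many-core CPU, and Its Application in Several Nested Parallel Algorithms
Cite this article:SUN Qiao, ZHANG Jia-Ji, LI Lei-Sheng, ZHAO Hai-Tao, ZHAO Hui, WU Chang-Mao. SWAN, a Task-Parallel Scheduling System for the SW26010 Many-core CPU, and Its Application in Several Nested Parallel Algorithms[J]. 软件学报 (Journal of Software), 2020, 31(7)
Authors:SUN Qiao  ZHANG Jia-Ji  LI Lei-Sheng  ZHAO Hai-Tao  ZHAO Hui  WU Chang-Mao
Affiliation:Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Funding:Strategic Priority Research Program of the Chinese Academy of Sciences (Category C) (XDC01030200); National Natural Science Foundation of China (61672508)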
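Abstract (translated from the Chinese):Task parallelism is a fundamental design pattern in parallel programming. However, because of the complexity of the algorithms themselves and the particular characteristics of the target platform, designing and implementing efficient task-parallel programs is often challenging for programmers. Targeting the emerging SW26010 many-core CPU, this paper proposes SWAN, a general-purpose runtime framework that supports nested task parallelism. SWAN provides high-level abstractions for implementing task-parallel programs, so that programmers can concentrate on the algorithm logic and develop more productively. For performance, SWAN partitions many of its shared resources at a fine granularity, which effectively avoids heavy contention for shared resources among the many threads. The paper also makes full use of platform features such as the high-speed memory access mechanism, the fast software-controlled cache, and atomic operations to optimize the design of SWAN's core data structures and reduce the framework's own overhead. In addition, SWAN provides dynamic load balancing so that the resources of every processor core are fully used. Using SWAN, the paper implements several typical nested parallel algorithms with recursive structure on the target platform, including the N-queens problem, binary-tree traversal, quicksort, and convex hull computation. Experiments show that the algorithms parallelized with SWAN achieve speedups of 4.5x to 32x over their serial versions, which demonstrates the framework's practicality and performance.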

Keywords:Task-Parallel Framework  Parallel Computing  Nested Parallel Algorithms  SWAN  SW26010 Many-core CPU
Received:2019-08-22
Revised:2019-12-05

SWAN: A Task Parallel Framework and Its Application in Nested Parallel Algorithms on the SW26010 Many-core Platform
SUN Qiao, ZHANG Jia-Ji, LI Lei-Sheng, ZHAO Hai-Tao, ZHAO Hui, WU Chang-Mao. SWAN: A Task Parallel Framework and Its Application in Nested Parallel Algorithms on the SW26010 Many-core Platform[J]. Journal of Software, 2020, 31(7)
Authors:SUN Qiao  ZHANG Jia-Ji  LI Lei-Sheng  ZHAO Hai-Tao  ZHAO Hui  WU Chang-Mao
Affiliation:Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Abstract:Task parallelism is one of the fundamental patterns for designing parallel algorithms. However, because of algorithmic complexity and distinctive hardware features, implementing algorithms in a task-parallel style often remains challenging. This paper proposes SWAN, a general runtime framework for the emerging SW26010 many-core CPU platform that supports nested task parallelism. SWAN provides high-level abstractions for implementing task parallelism, so that programmers can focus mainly on the algorithm itself and work more productively. For performance, the shared resources and information manipulated by SWAN are partitioned in a fine-grained manner to avoid fierce contention among worker threads. The core data structures within SWAN take advantage of the platform's high-bandwidth memory access mechanism, fast on-chip scratchpad cache, and atomic operations to reduce the overhead of SWAN itself. In addition, SWAN provides dynamic load-balancing strategies at runtime to keep the threads fully occupied. In the experiments, a set of recursive algorithms with nested parallelism, including the N-queens problem, binary-tree traversal, quicksort, and convex hull, are implemented using SWAN on the target platform. The results show that each algorithm gains a significant speedup, from 4.5x to 32x, over its serial counterpart, which suggests that SWAN offers high usability and performance.
Keywords:Task Parallel Framework  Parallel Computing  Nested Parallelism  SWAN  SW26010 Many-core CPU
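
The ideas named in the abstract (nested task spawning, per-worker partitioning of shared state, and dynamic load balancing) can be illustrated with a small, generic sketch. This is not SWAN's API or implementation, which the abstract does not describe. `Scheduler`, `spawn`, `place`, `count_queens`, and the `cutoff` parameter are hypothetical names chosen for illustration, work stealing is assumed here as one common load-balancing strategy, and Python threads stand in for the SW26010 cores.

```python
import threading
import time
from collections import deque


class Scheduler:
    """Hypothetical work-stealing scheduler; an illustration, not SWAN's design."""

    def __init__(self, num_workers):
        self.num_workers = num_workers
        # One queue and one lock per worker, so workers rarely contend.
        self.queues = [deque() for _ in range(num_workers)]
        self.locks = [threading.Lock() for _ in range(num_workers)]
        self.pending = 0  # tasks spawned but not yet finished
        self.pending_lock = threading.Lock()
        self.local = threading.local()

    def worker_id(self):
        # Threads that are not workers (e.g. the main thread) map to slot 0.
        return getattr(self.local, "wid", 0)

    def spawn(self, fn, *args):
        """Queue a task on the calling worker's own queue (nesting allowed)."""
        wid = self.worker_id()
        with self.pending_lock:
            self.pending += 1
        with self.locks[wid]:
            self.queues[wid].append((fn, args))

    def _take(self, wid):
        # Prefer the newest local task; otherwise steal the oldest from a peer.
        with self.locks[wid]:
            if self.queues[wid]:
                return self.queues[wid].pop()
        for k in range(1, self.num_workers):
            victim = (wid + k) % self.num_workers
            with self.locks[victim]:
                if self.queues[victim]:
                    return self.queues[victim].popleft()
        return None

    def _worker(self, wid):
        self.local.wid = wid
        while True:
            task = self._take(wid)
            if task is None:
                with self.pending_lock:
                    if self.pending == 0:  # nothing queued, nothing running
                        return
                time.sleep(0)  # yield, then look for work again
                continue
            fn, args = task
            try:
                fn(*args)
            finally:
                with self.pending_lock:
                    self.pending -= 1

    def run(self, fn, *args):
        self.spawn(fn, *args)
        threads = [threading.Thread(target=self._worker, args=(w,))
                   for w in range(self.num_workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()


def place(sched, n, row, cols, diag1, diag2, counts, cutoff):
    """Place a queen on `row`; rows above `cutoff` spawn child tasks."""
    if row == n:
        counts[sched.worker_id()] += 1  # each worker writes only its own slot
        return
    mask = (1 << n) - 1
    free = ~(cols | diag1 | diag2) & mask
    while free:
        bit = free & -free  # lowest free column
        free ^= bit
        args = (sched, n, row + 1, cols | bit,
                ((diag1 | bit) << 1) & mask, (diag2 | bit) >> 1, counts, cutoff)
        if row < cutoff:
            sched.spawn(place, *args)  # nested task
        else:
            place(*args)  # deep rows run serially to limit overhead


def count_queens(n, workers=4, cutoff=2):
    sched = Scheduler(workers)
    counts = [0] * workers  # per-worker counters, summed after the join
    sched.run(place, sched, n, 0, 0, 0, 0, counts, cutoff)
    return sum(counts)
```

Calling `count_queens(8, workers=4)` returns 92, the number of solutions to the 8-queens problem. The single `pending` counter is itself a shared, contended resource, and Python's global interpreter lock prevents real speedup, so the sketch shows the structure of the approach rather than its performance.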