首页 | 本学科首页   官方微博 | 高级检索  
     

面向通讯同步的多处理器阵列重构
引用本文:吴亚兰,武继刚,姜文超,刘竹松.面向通讯同步的多处理器阵列重构[J].计算机科学,2017,44(7):47-56.
作者姓名:吴亚兰  武继刚  姜文超  刘竹松
作者单位:广东工业大学计算机学院 广州510006,广东工业大学计算机学院 广州510006,广东工业大学计算机学院 广州510006,广东工业大学计算机学院 广州510006
基金项目:本文受国家自然科学基金项目(61572144),广东省科技计划应用专项基金(2015B010129014),广东省自然科学基金项目(2016A030313703)资助
摘    要:从多处理器阵列中获取所需大小并且同步通讯性能优良的子阵列,是高性能拓扑重构的核心问题之一。基于不同的逻辑列剔除策略提出了3种面向通讯同步的拓扑重构算法:基于分治思想剔除逻辑列的重构算法(SCA_01),该算法能够使被优化的逻辑列相对均匀地分布在物理阵列中;优先剔除长逻辑列的贪心重构算法(SCA_02),该算法能够使被优化的逻辑列的长链接总数最少;基于分治与长链接数的混成重构算法(SCA_03),该算法将某一区域内的最长逻辑列剔除,且尽可能将剩余逻辑列均匀分布在物理阵列中。同时,对逻辑阵列的最大通讯延时给出了下界的求解算法。实验结果表明,3种算法在故障率小于1%、逻辑列的剔除率超过20%时,算法重构出的逻辑阵列的通讯延时特别接近计算出的性能下界。在多数情况下SCA_01优于SCA_02和SCA_03,而后两者的性能相近。在小阵列上且故障率与剔除率较小时,SCA_02具有性能优势,但在大阵列上SCA_03具有优势。在32×32的阵列上,SCA_01构造的阵列产生的通讯延时较SCA_02和SCA_03产生的延时平均减少25%,并且运行速度也提升了19.4%。

关 键 词:VLSI阵列  拓扑重构  容错  分治  算法
收稿时间:2016/8/20 0:00:00
修稿时间:2016/11/3 0:00:00

Reconfiguring Multiprocessor Arrays for Synchronous Communication
WU Ya-lan,WU Ji-gang,JIANG Wen-chao and LIU Zhu-song.Reconfiguring Multiprocessor Arrays for Synchronous Communication[J].Computer Science,2017,44(7):47-56.
Authors:WU Ya-lan  WU Ji-gang  JIANG Wen-chao and LIU Zhu-song
Affiliation:School of Computer Science and Technology,Guangdong University of Technology,Guangzhou 510006,China,School of Computer Science and Technology,Guangdong University of Technology,Guangzhou 510006,China,School of Computer Science and Technology,Guangdong University of Technology,Guangzhou 510006,China and School of Computer Science and Technology,Guangdong University of Technology,Guangzhou 510006,China
Abstract:Reconfiguring VLSI arrays to get a logical array with given size and synchronous communication is one of the key problems in reconfigurable topology for high performance computing.This paper presented three algorithms based on three different strategies of excluding logical co-lumns.The first algorithm,named SCA_01,can make the logical co-lumns in the uniform distribution in the host array,based on the divide-and-conquer for excluding logical columns.The second algorithm,named SCA_02,can minimize the number of the long interconnects of the logical array,based on the excluding the long logical column in priority.The third algorithm,named SCA_03,keeps the logical columns distributed in uniform way based on the hybrid strategy from excluding the long logical column and divide-and-conquer.In addition,this paper contributed an algorithm to calculate the lower bound of the communication delay for the given logical array.Simulation results show that,the communication delay of the logical array reconstructed by three algorithms is close to the lower bound proposed in this paper,when fault rate is less than 1% and the exclusion rate of logical columns is over 20%.Algorithm SCA_01 is superior to SCA_02 and SCA_03 in most cases,while SCA_02 and SCA_03 have nearly the same performance.But SCA_02 is better on smaller arrays and SCA_03 is better on large arrays,when the fault rate and exclusion rate are relatively small.The communication delay generated by SCA_01 is less than that of SCA_02 and SCA_03 by 25% on 32×32 host arrays.Moreover,SCA_01 is faster than the other two algorithms,and the running time is saved by 19.4%.It is concluded that SCA_01 is one of the relatively desirable algorithms to generate the logical arrays with minimum communication delay for high performance computing.
Keywords:VLSI array  Topology reconstruction  Fault-tolerance  Divide and conquer  Algorithm
点击此处可从《计算机科学》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号