
A hierarchical hardware barrier synchronization design for many-core processors
Cite this article:ZANG Zhao-hu,LI Chen,WANG Yao-hua,CHEN Xiao-wen,GUO Yang.A hierarchical hardware barrier synchronization design for many-core processors[J].Computer Engineering & Science (计算机工程与科学),2022,44(11):1901-1908.
Authors:ZANG Zhao-hu  LI Chen  WANG Yao-hua  CHEN Xiao-wen  GUO Yang
Affiliation:(College of Computer Science and Technology,National University of Defense Technology,Changsha 410073,China)
Funding:Research Program of the National University of Defense Technology (ZK20-04)
Received:2021-11-22
Revised:2022-03-22

Abstract:Synchronization plays an important role in ensuring data consistency and correctness across threads on multicore processors. As the number of processor cores grows, so does the cost of synchronization. Barrier synchronization is one of the important methods for multicore synchronization in parallel applications. Software synchronization methods typically need thousands of cycles to synchronize multiple cores, and this high-latency, serialized synchronization can significantly degrade the performance of multicore programs. Hardware barriers achieve lower synchronization latency than software barriers, but traditional centralized hardware barriers scale poorly and are hard to adapt to the synchronization needs of many-core processor systems. This paper proposes HSync, a hierarchical hardware barrier mechanism for many-core processors. It consists of local and global barrier units, which cooperate to achieve fast synchronization with low hardware overhead. Experimental results show that, compared with a traditional centralized hardware barrier, the hierarchical hardware barrier mechanism improves the performance of the many-core processor system by 1.13 times and reduces network traffic by 74%.
Keywords:hardware synchronization  barrier  many-core processors  parallel computing  
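
The abstract gives only the outline of the hierarchy: local barrier units plus a global barrier unit. The Python sketch below is a toy message-counting model of that outline, not the HSync hardware design from the paper. Everything in it is an assumption made for illustration: the class and function names, the single global unit, the grouping of cores into equal-sized clusters, the choice to count only messages that cross the global network, and the centralized baseline of one arrival plus one release message per core. It does not reproduce the paper's measured 74% traffic reduction.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalBarrierUnit:
    """Hypothetical global unit: waits for one arrival message per cluster."""
    num_clusters: int
    arrived: set = field(default_factory=set)
    messages: int = 0  # messages crossing the global network (assumed metric)

    def cluster_arrived(self, cluster_id: int) -> bool:
        """Record a cluster arrival; return True once every cluster has arrived."""
        self.messages += 1
        self.arrived.add(cluster_id)
        return len(self.arrived) == self.num_clusters

    def release(self) -> None:
        """Send one release message per cluster, then reset for the next barrier."""
        self.messages += self.num_clusters
        self.arrived.clear()


@dataclass
class LocalBarrierUnit:
    """Hypothetical local unit: counts arrivals from the cores of one cluster."""
    cluster_id: int
    cores_per_cluster: int
    count: int = 0

    def core_arrived(self) -> bool:
        """Return True when the last core of the cluster has arrived."""
        self.count += 1
        if self.count == self.cores_per_cluster:
            self.count = 0  # reset so the unit can be reused
            return True
        return False


def hierarchical_global_messages(num_cores: int, cores_per_cluster: int) -> int:
    """Global-network messages for one barrier episode in the two-level model."""
    if num_cores % cores_per_cluster != 0:
        raise ValueError("num_cores must be a multiple of cores_per_cluster")
    num_clusters = num_cores // cores_per_cluster
    global_unit = GlobalBarrierUnit(num_clusters)
    local_units = [LocalBarrierUnit(c, cores_per_cluster) for c in range(num_clusters)]
    for core in range(num_cores):
        unit = local_units[core // cores_per_cluster]
        # Core-to-local-unit signalling is assumed to stay inside the cluster,
        # so it is not counted as global traffic.
        if unit.core_arrived() and global_unit.cluster_arrived(unit.cluster_id):
            global_unit.release()
    return global_unit.messages


def centralized_messages(num_cores: int) -> int:
    """Assumed baseline: every core sends one arrival and receives one release."""
    return 2 * num_cores


if __name__ == "__main__":
    print(centralized_messages(64), hierarchical_global_messages(64, 8))
```

In this model, 64 cores grouped into clusters of 8 generate 16 global messages per barrier, against 128 for the assumed centralized baseline. The ratio is fixed entirely by the assumed cluster size, so it illustrates the scaling argument rather than the paper's experimental result.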