Design of an ARM and FPGA Heterogeneous Accelerator for SpMV Computation
Original title: SpMV计算的ARM和FPGA异构加速器设计
Cite this article: ZHU Mingda, XUE Jiqing, AI Chunyao. Design of an ARM and FPGA Heterogeneous Accelerator for SpMV Computation[J]. Telecommunication Engineering (电讯技术), 2024, 64(2): 302-309.
Authors: ZHU Mingda (朱明达), XUE Jiqing (薛济擎), AI Chunyao (艾纯瑶)
Affiliation: College of Information Science and Engineering, China University of Petroleum (Beijing), Beijing 102249, China
Abstract: Sparse matrix-vector multiplication (SpMV) runs inefficiently on edge devices. To address this, the authors study sparse-matrix storage formats and field-programmable gate array (FPGA) acceleration of SpMV, and propose an acceleration method that combines a multi-port modified compressed sparse row (MCSR) format with task-level and data-level hardware optimization on an ARM+FPGA architecture. Multiple ports read and write data in parallel to raise computational parallelism. Dataflow and loop pipelining provide parallelism between loops and within loops. Array partitioning and stream transfer provide fine-grained parallel buffering and computation. The ARM core controls the system and offloads the computation to the FPGA for parallel acceleration. Experimental results show that the optimized ARM+FPGA scheme achieves up to a 10x speedup over the ARM-only scheme, while the added resource consumption stays within an acceptable range. The speedup grows with the matrix size and with the number of non-zero values. The results are of some practical value for implementing SpMV computation at the edge.

Keywords: sparse matrix-vector multiplication (SpMV); heterogeneous accelerator; hardware acceleration
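
For readers unfamiliar with the kernel being accelerated, the following is a minimal software sketch of SpMV over the standard compressed sparse row (CSR) layout. The abstract does not describe the MCSR layout itself, so this is not the authors' format or implementation. The function names and the `num_ports` parameter are illustrative assumptions, and the round-robin row split is only one plausible way to picture several ports working on disjoint rows.

```python
from typing import List, Sequence, Tuple


def dense_to_csr(dense: Sequence[Sequence[float]]) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense row-major matrix to standard CSR arrays."""
    values: List[float] = []   # non-zero values, row by row
    col_idx: List[int] = []    # column index of each stored value
    row_ptr: List[int] = [0]   # row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_csr(values, col_idx, row_ptr, x) -> List[float]:
    """Baseline CSR SpMV: y = A @ x, one row at a time."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def spmv_csr_multiport(values, col_idx, row_ptr, x, num_ports: int = 2) -> List[float]:
    """Hypothetical multi-port view: rows are dealt round-robin to num_ports lanes.

    Each lane touches a disjoint set of rows, so in hardware the lanes could
    run concurrently. Here they run one after another, which gives the same
    result as spmv_csr.
    """
    if num_ports < 1:
        raise ValueError("num_ports must be at least 1")
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for port in range(num_ports):
        for i in range(port, n_rows, num_ports):
            acc = 0.0
            for k in range(row_ptr[i], row_ptr[i + 1]):
                acc += values[k] * x[col_idx[k]]
            y[i] = acc
    return y
```

Because the lanes write to disjoint output rows, the split needs no synchronization on `y`. Load balance across lanes still depends on how the non-zero values are distributed over the rows.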