
Research on Acceleration of Matrix Multiplication Based on Parallel Scheduling on MPSoC

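Cite this article:	YANG Fei, MA Yu-chun, HOU Jin, XU Ning. Research on Acceleration of Matrix Multiplication Based on Parallel Scheduling on MPSoC[J]. Computer Science, 2017, 44(8): 36-41.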

Authors:	YANG Fei, MA Yu-chun, HOU Jin, XU Ning

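Affiliations:	Hubei Key Laboratory of Intelligent Wireless Communications, South-central University for Nationalities, Wuhan 430074; Department of Computer Science & Technology, Tsinghua University, Beijing 100084, Department of Computer Science & Technology, Tsinghua University, Beijing 100084, Hubei Key Laboratory of Intelligent Wireless Communications, South-central University for Nationalities, Wuhan 430074, Hubei Key Laboratory of Transportation Internet of Things, Wuhan University of Technology, Wuhan 430074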

Funding:	This work was supported by the European Union Seventh Framework Programme (318521) and the General Program of the National Natural Science Foundation of China (61076035).

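Abstract:	Matrix multiplication underlies numerical analysis and graphics and image processing algorithms, and the design of general-purpose matrix multiplication accelerators has long been a research focus in embedded system design. However, because of its high computational complexity and low processing efficiency, matrix multiplication often becomes the bottleneck of computation speed in embedded systems. To make better use of matrix multiplication in the embedded field, a hardware/software co-designed acceleration architecture based on MPSoC (MultiProcessor System-on-Chip) is proposed. Within the MPSoC architecture, on the one hand, a matrix blocking method oriented to hardware constraints is designed, yielding a general-purpose matrix multiplication accelerator system; on the other hand, by exploiting the multi-core architecture of the MPSoC, corresponding task partitioning and load-balancing scheduling algorithms are proposed, which improve parallel efficiency and the overall system speed-up ratio. Experimental results show that the proposed architecture and algorithms support general matrix multiplication, and that the multi-core parallel scheduling algorithm realized through hardware/software co-design significantly improves computational efficiency compared with a traditional single-core design.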
Keywords:	Matrix multiplication; MPSoC; Parallel computing; Load balancing
Received:	2016/7/30
Revised:	2016/11/10
Research on Acceleration of Matrix Multiplication Based on Parallel Scheduling on MPSoC

YANG Fei, MA Yu-chun, HOU Jin and XU Ning. Research on Acceleration of Matrix Multiplication Based on Parallel Scheduling on MPSoC[J]. Computer Science, 2017, 44(8): 36-41.

Authors:	YANG Fei, MA Yu-chun, HOU Jin and XU Ning

Affiliation:	Hubei Key Laboratory of Intelligent Wireless Communications, South-central University for Nationalities, Wuhan 430074, China; Department of Computer Science & Technology, Tsinghua University, Beijing 100084, China, Department of Computer Science & Technology, Tsinghua University, Beijing 100084, China, Hubei Key Laboratory of Intelligent Wireless Communications, South-central University for Nationalities, Wuhan 430074, China and Hubei Key Laboratory of Transportation Internet of Things, Wuhan University of Technology, Wuhan 430074, China

Abstract:	Matrix multiplication is a fundamental operation in numerical analysis and in graphics and image processing, and general-purpose matrix multiplication accelerators have long been a research focus in embedded system design. However, because of its high computational complexity and low processing efficiency, matrix multiplication often becomes the bottleneck of computation speed in embedded systems. To make matrix multiplication more practical in the embedded field, this paper proposes a hardware/software co-designed acceleration architecture based on MPSoC. Within the MPSoC architecture, a matrix partitioning method that accounts for hardware constraints is implemented in the HW/SW system to enable the computation of general matrix multiplications. Parallel computation across multiple cores and the hardware function unit is realized with load-balancing algorithms, which improve parallel efficiency and the speed-up ratio. The experimental results show that the proposed general matrix multiplication approach achieves a significant speed-up over traditional single-core approaches.

Keywords:	Matrix multiplication; MPSoC; Parallel computation; Load balance
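
The two ideas named in the abstract, partitioning a matrix into blocks sized for a hardware constraint and balancing block tasks across cores, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `blocked_matmul` and `balance`, the `tile` parameter, and the greedy longest-task-first heuristic are all assumptions chosen only to make the concepts concrete in plain Python.

```python
def blocked_matmul(a, b, tile):
    """Multiply a (n x k) by b (k x m) in tile x tile blocks.

    `tile` stands in for a hypothetical hardware limit on the block size
    that a fixed-size accelerator could process at once.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # One block product; edge blocks are clipped to the matrix bounds.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(k0, min(k0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


def balance(costs, workers):
    """Greedy assignment: give each task, largest first, to the least-loaded worker.

    Returns (plan, loads), where plan[w] lists the task indices for worker w
    and loads[w] is the total cost assigned to it.
    """
    loads = [0] * workers
    plan = [[] for _ in range(workers)]
    for task in sorted(range(len(costs)), key=lambda t: -costs[t]):
        w = loads.index(min(loads))  # first least-loaded worker
        plan[w].append(task)
        loads[w] += costs[task]
    return plan, loads
```

In a real MPSoC design the per-block cost would depend on whether a block runs on a processor core or on the hardware function unit; here the costs passed to `balance` are simply assumed to be known numbers.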
