Quantitative Performance Analysis Model of Matrix Multiplication Based on GPU
Cite this article: YIN Meng-jia, XU Xian-bin, XIONG Zeng-gang, ZHANG Tao. Quantitative Performance Analysis Model of Matrix Multiplication Based on GPU[J]. Computer Science, 2015, 42(12): 13-17, 22
Authors: YIN Meng-jia  XU Xian-bin  XIONG Zeng-gang  ZHANG Tao
Affiliations: School of Computer, Wuhan University, Wuhan 430072, China; School of Computer and Information Science, Hubei Engineering University, Xiaogan 432000, China, School of Computer, Wuhan University, Wuhan 430072, China, School of Computer and Information Science, Hubei Engineering University, Xiaogan 432000, China and School of Computer and Information Science, Hubei Engineering University, Xiaogan 432000, China
Funding: Supported by the National Natural Science Foundation of China (61370092), the Natural Science Foundation of Hubei Province (2013CFC005), and the Hubei Province Young and Middle-aged Innovation Team program (T201410)
Received: 2015-01-29
Revised: 2015-03-15

Abstract: Performance evaluation and optimization are indispensable when designing efficient parallel programs, and the performance of the memory system directly affects the overall performance of the processor. We used GPGPU-Sim to simulate the GPU memory hierarchy and identified the best ratio between the number of SMs (streaming multiprocessors) and the number of memory controllers. Matrix multiplication is a basic building block of scientific computing and a representative application that is both compute-intensive and memory-access-intensive, so its performance is an important indicator of GPU high-performance computing. As an approach to evaluating parallel systems, a performance model offers advantages that other evaluation methods cannot match. To improve the performance of matrix multiplication, this paper proposes a quantitative performance model for the GPU. The model quantitatively analyzes the instruction pipeline, shared memory access, and global memory access to locate the program's performance bottleneck, and that analysis is then used to improve execution speed. Experiments show that the model is practical and that it effectively guides the optimization of matrix multiplication.

Keywords: GPU  GPGPU-Sim  Matrix multiplication  Quantitative performance analysis model  Instruction pipeline  Shared memory access  Global memory access
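
The three-part analysis named in the abstract (instruction pipeline, shared memory access, global memory access) can be illustrated with a rough bottleneck estimate. The sketch below is not the authors' model, whose equations are not given on this page. It assumes a square, tiled matrix multiplication, treats each component as "work divided by peak rate", and calls the slowest component the bottleneck. The names `GpuParams`, `estimate_tiled_matmul`, and `bottleneck`, along with every hardware number, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GpuParams:
    """Hypothetical peak rates; placeholders, not measurements from the paper."""
    peak_flops: float  # floating-point operations per second
    shared_bw: float   # shared-memory bandwidth in bytes per second
    global_bw: float   # global-memory bandwidth in bytes per second


def estimate_tiled_matmul(n: int, tile: int, gpu: GpuParams,
                          elem_bytes: int = 4) -> dict:
    """Estimate the time of each component for an n x n tiled multiply.

    Assumption: the three components overlap perfectly, so each one is
    modeled independently as (amount of work) / (peak rate).
    """
    # One multiply and one add for each of the n^3 inner-loop steps.
    flops = 2.0 * n ** 3
    # Each inner-loop step reads one element of A and one of B from shared memory.
    shared_bytes = 2.0 * n ** 3 * elem_bytes
    # With tile x tile blocking, every element of A and B is fetched from
    # global memory n / tile times; C is written once.
    global_bytes = (2.0 * n ** 3 / tile + n ** 2) * elem_bytes
    return {
        "instruction_pipeline": flops / gpu.peak_flops,
        "shared_memory": shared_bytes / gpu.shared_bw,
        "global_memory": global_bytes / gpu.global_bw,
    }


def bottleneck(times: dict) -> str:
    """Return the name of the component with the largest estimated time."""
    return max(times, key=times.get)


if __name__ == "__main__":
    gpu = GpuParams(peak_flops=1e12, shared_bw=1e12, global_bw=1e11)
    for tile in (1, 16):
        times = estimate_tiled_matmul(1024, tile, gpu)
        print(tile, bottleneck(times), times)
```

Under these assumed rates, enlarging the tile reduces global-memory traffic by roughly the tile width, which is the kind of trade-off a quantitative model is meant to expose. A real model would also need to account for latency, occupancy, and bank conflicts, none of which are represented here.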