A run-time optimization approach for reducing data movements using locality-aware searching
Authors: Liang Li, Endong Wang, Xingjun Zhang, Kang Yan, Tao Ju, Xiaoshe Dong
Affiliation: 1. Xi’an Jiaotong University, Xi’an, China
2. The State Key Laboratory of High-end Server and Storage Technology, Jinan, China
Abstract: The CPU–GPU communication bottleneck limits the performance of GPU applications in heterogeneous GPGPU systems and is usually addressed through data reuse optimization. This paper analyzes data reuse through a DAG abstraction and derives rules showing that run-time data reuse optimization can effectively relieve the bottleneck. Based on these rules, the paper proposes a run-time optimization framework for data reuse called R-Tracker. R-Tracker uses a locality-aware searching approach to handle reuses. It implements data reuse optimization at low cost and performs the searching, the data transfers, and the GPU computation concurrently. R-Tracker relaxes the constraints required by compiler-based approaches and thus achieves better reuse. Experimental results show that R-Tracker improves performance by 1.77–16.42% over the compiler-based approach OpenMPC and by 1.40–8.39% over CGCM in single-node execution, and by 48.78–60% over CGCM in multi-node execution.
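The abstract does not describe R-Tracker's internals, so the following is only a minimal sketch of the general idea behind run-time data reuse: track which buffers already have a valid device copy and skip redundant CPU–GPU transfers. The `ReuseTracker` class, its method names, and the buffer names are all hypothetical, transfers are merely counted rather than performed, and neither the locality-aware search nor the overlap of searching, transfers, and computation is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class ReuseTracker:
    """Hypothetical run-time tracker (not the paper's R-Tracker)."""
    valid_on_device: set = field(default_factory=set)  # buffers with an up-to-date device copy
    dirty_on_device: set = field(default_factory=set)  # device results not yet copied to host
    h2d_transfers: int = 0  # simulated host-to-device copies
    d2h_transfers: int = 0  # simulated device-to-host copies

    def launch(self, reads, writes):
        # Copy in only the inputs whose device copy is missing or stale.
        for name in reads:
            if name not in self.valid_on_device:
                self.h2d_transfers += 1
                self.valid_on_device.add(name)
        # Outputs now live on the device; the host copy is stale until read back.
        for name in writes:
            self.valid_on_device.add(name)
            self.dirty_on_device.add(name)

    def host_read(self, name):
        # Copy back lazily, only when the host needs a device-produced result.
        if name in self.dirty_on_device:
            self.d2h_transfers += 1
            self.dirty_on_device.discard(name)

    def host_write(self, name):
        # Assumption: the host overwrites the whole buffer, so the device copy is simply invalidated.
        self.valid_on_device.discard(name)
        self.dirty_on_device.discard(name)


def run_pipeline(tracker):
    tracker.launch(reads=["a", "b"], writes=["c"])  # a and b are copied in
    tracker.launch(reads=["a", "c"], writes=["d"])  # a and c are reused on the device
    tracker.host_write("b")                         # device copy of b becomes stale
    tracker.launch(reads=["b", "d"], writes=["e"])  # only b is copied in again
    tracker.host_read("e")                          # only e is copied back
    return tracker.h2d_transfers, tracker.d2h_transfers
```

In this toy pipeline, a scheme that copies every input in and every output back around each kernel launch would perform six host-to-device and three device-to-host copies; the tracker above performs three and one.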
This article is indexed in SpringerLink and other databases.