Improved GPU SIMD control flow efficiency via hybrid warp size mechanism
Affiliation: 1. Research Scholar, Department of Electronics & Communication Engineering, SRM Institute of Science and Technology, Chennai, India; 2. Associate Professor, Department of Electronics & Communication Engineering, SRM Institute of Science and Technology, Chennai, India; 3. Assistant Professor, Department of Telecommunication Engineering, SRM Institute of Science and Technology, Chennai, India
Abstract: High single instruction multiple data (SIMD) efficiency and low power consumption have made graphics processing units (GPUs) an ideal platform for many complex computational applications. Programmers can create thousands of threads, which are grouped into fixed-size SIMD batches known as warps. High throughput is then achieved by executing these warps concurrently with minimal control overhead. However, when a branch instruction sends threads of the same warp down different paths, the warp is split into multiple warps that must be executed serially, which reduces the efficiency advantage of SIMD. This paper replaces the contemporary fixed-size warp design with a hybrid warp size (HWS) mechanism. Mixed-size warps are generated according to HWS and are scheduled and issued flexibly. Simulation results show that this mechanism yields an average speedup of 1.20 over the baseline architecture for a wide variety of general-purpose GPU applications. The paper also integrates HWS with dynamic warp formation (DWF), a well-known branch handling mechanism that improves SIMD utilization by forming new warps out of split warps in real time. Simulation results show that the combination of DWF and HWS yields an average speedup of 1.27 over the DWF-only platform, with an estimated area increase of about 1% relative to DWF.
Keywords: SIMD; GPU; Warp; Branch divergence
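
The branch divergence effect described in the abstract can be illustrated with a toy model. The sketch below is not the HWS mechanism from the paper; it only assumes that a warp whose threads disagree on a branch issues once per distinct path, occupying all of its lanes each time. The function name `simd_efficiency` and the one-instruction-per-path simplification are assumptions made purely for illustration.

```python
from typing import List


def simd_efficiency(branch_taken: List[bool], warp_size: int) -> float:
    """Toy estimate of SIMD lane utilization under branch divergence.

    Assumption (illustrative only): each thread executes one instruction on
    its side of the branch, and a warp issues once per distinct path taken
    by its threads, occupying all `warp_size` lanes on every issue.
    """
    if warp_size <= 0 or len(branch_taken) % warp_size != 0:
        raise ValueError("thread count must be a positive multiple of warp_size")

    occupied_lane_slots = 0
    for start in range(0, len(branch_taken), warp_size):
        warp = branch_taken[start:start + warp_size]
        # 1 issue if all threads agree, 2 if the warp diverges.
        distinct_paths = len(set(warp))
        occupied_lane_slots += distinct_paths * warp_size

    # Useful work is one instruction per thread.
    return len(branch_taken) / occupied_lane_slots


if __name__ == "__main__":
    # Hypothetical pattern: the first 8 of 32 threads take the branch.
    pattern = [True] * 8 + [False] * 24
    for size in (32, 16, 8):
        print(size, round(simd_efficiency(pattern, size), 3))
    # prints:
    # 32 0.5
    # 16 0.667
    # 8 1.0
```

In this model a single 32-thread warp wastes half of its lane slots, while narrower warps confine the divergence to fewer lanes. The model ignores the extra fetch and scheduling overhead that narrower warps typically incur, so it shows only one side of the trade-off that a mixed-size design has to balance.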
This article is indexed in ScienceDirect and other databases.