

A Case Study of Implementing Supernode Transformations
Authors: Johann Steinbrecher, Cesar J Philippidis, Weijia Shang
Affiliation:1. Computer Engineering Department, Santa Clara University, 500 El Camino Real, Santa Clara, CA, 95053, USA
Abstract: Supernode transformation is a technique for reducing communication overhead when a loop nest is partitioned and scheduled onto a multi-processor system. It works by grouping a number of iterations of a perfectly nested loop with regular dependences into a supernode. Previous work has focused on finding the optimal supernode size and shape, as well as an optimal execution schedule, for multi-processor systems with unbounded resources. This paper focuses instead on practical implementation strategies for supernode transformations on multi-core systems with limited resources. Using the longest common subsequence (LCS) problem as an example, we present and compare three different multithreaded implementations. A formula for the total execution time of each method is presented. The techniques are benchmarked on a 12-core and a 4-core machine. On the 12-core machine, our first technique, which increases data locality, runs 16.7 times faster than the unaltered sequential loop nest. Combining this technique with loop skewing, by changing the linear schedule, yields a 42.6-fold speedup. A more sophisticated method that executes entire rows of the loop nest in one thread yields a 59.5-fold speedup. The concepts presented and discussed in this paper for the LCS problem serve as a foundation for implementing other algorithms with regular dependences.
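The grouping idea in the abstract can be illustrated with a minimal sketch in Python. It is not one of the paper's three implementations: the function names, the default tile sizes, and the choice of a diagonal wavefront order are assumptions made for illustration, and the tiles are executed sequentially rather than in threads. The LCS table is split into rectangular supernodes (tiles), and tiles with the same `ti + tj` are visited together because they do not depend on one another.

```python
def lcs_reference(a, b):
    """Plain row-by-row LCS length (the unaltered sequential loop nest)."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcs_supernodes(a, b, tile_rows=4, tile_cols=4):
    """LCS length computed tile by tile (hypothetical helper, illustration only)."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    tiles_down = (n + tile_rows - 1) // tile_rows
    tiles_across = (m + tile_cols - 1) // tile_cols

    def run_tile(ti, tj):
        # One supernode: a tile_rows x tile_cols block of iterations.
        row_start = ti * tile_rows + 1
        col_start = tj * tile_cols + 1
        for i in range(row_start, min(row_start + tile_rows, n + 1)):
            for j in range(col_start, min(col_start + tile_cols, m + 1)):
                if a[i - 1] == b[j - 1]:
                    table[i][j] = table[i - 1][j - 1] + 1
                else:
                    table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Tile (ti, tj) reads only from tiles (ti-1, tj), (ti, tj-1) and
    # (ti-1, tj-1), which all belong to earlier waves. Tiles inside one
    # wave are independent; here they simply run one after another.
    for wave in range(tiles_down + tiles_across - 1):
        first = max(0, wave - tiles_across + 1)
        last = min(wave, tiles_down - 1)
        for ti in range(first, last + 1):
            run_tile(ti, wave - ti)
    return table[n][m]


if __name__ == "__main__":
    print(lcs_supernodes("AGGTAB", "GXTXAYB", tile_rows=2, tile_cols=3))
```

In a multithreaded variant, the tiles of one wave could be handed to separate worker threads with a barrier between waves. How the work is actually distributed, and how the skewed and row-per-thread methods differ, is described in the paper itself and is not reproduced here.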
This article is indexed in SpringerLink and other databases.