Accelerating sequential programs on commodity multi-core processors |
| |
Authors: | Yuanming Zhang; Gang Xiao; Takanobu Baba |
| |
Affiliation: | 1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China; 2. Department of Information Science, Utsunomiya University, Utsunomiya, Japan |
| |
Abstract: | A recently proposed pipelined multithreading (PMT) technique is widely applicable to parallelizing general sequential programs on multi-core processors. However, significant inter-core communication overhead limits PMT performance and prevents its commercial use. A simple and effective clustered pipelined multithreading (CPMT) approach is presented to accelerate sequential programs on commodity multi-core processors. CPMT adopts a clustered communication mechanism that, in a software-only approach, yields very low average communication overhead by eliminating false sharing and by reducing both communication-operation delays and transit delays. A single-producer/single-consumer concurrent lock-free clusteredQueue algorithm based on a two-level queue structure is also proposed. The accuracy of CPMT is theoretically demonstrated. The performance of the algorithm and of CPMT is evaluated on a commodity AMD Phenom four-core processor. Given an appropriate parameter, the enqueue and dequeue times of the algorithm are 20.8 and 23 cycles, respectively. The speedup of CPMT ranges from 13.1% to 119.8% for typical loops extracted from the SPEC CPU 2000 benchmark suite. |
| |
Keywords: | Commodity multi-core processors; Pipeline parallelism; Clustered communication mechanism |
This article is indexed in ScienceDirect and other databases. |
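
To make the idea of a two-level, single-producer/single-consumer lock-free queue more concrete, the following is a minimal C++17 sketch. It is not the authors' clusteredQueue implementation, which the abstract does not give. The class name `ClusteredQueue`, the template parameters `ClusterSize` and `NumClusters`, the `flush` method, and the 64-byte cache-line size are all assumptions made for illustration. The sketch shows two of the ideas the abstract names: items are handed over one whole cluster at a time, so the shared counters are touched once per cluster rather than once per item, and producer-side and consumer-side fields are placed on separate cache lines to avoid false sharing.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

// Illustrative sketch only: names, layout, and the 64-byte line size are assumptions.
// Exactly one producer thread may call enqueue/flush; exactly one consumer may call dequeue.
template <typename T, std::size_t ClusterSize, std::size_t NumClusters>
class ClusteredQueue {
    struct Cluster {
        std::array<T, ClusterSize> items{};
        std::size_t count = 0;  // number of valid items, set before publishing
    };

public:
    // Producer: buffer one item; publish the cluster once it is full.
    // Returns false when every cluster is still owned by the consumer.
    bool enqueue(const T& value) {
        if (fill_ == 0 &&
            tail_ - consumed_.load(std::memory_order_acquire) == NumClusters) {
            return false;  // ring of clusters is full
        }
        clusters_[tail_ % NumClusters].items[fill_++] = value;
        if (fill_ == ClusterSize) flush();
        return true;
    }

    // Producer: publish a partially filled cluster (for example, at end of input).
    void flush() {
        if (fill_ == 0) return;
        clusters_[tail_ % NumClusters].count = fill_;
        fill_ = 0;
        ++tail_;
        // Release: the item writes above become visible before the new count.
        published_.store(tail_, std::memory_order_release);
    }

    // Consumer: take the next item, or std::nullopt if no published cluster remains.
    std::optional<T> dequeue() {
        if (head_ == published_cache_) {
            // Re-read the shared counter only when the cached value is used up.
            published_cache_ = published_.load(std::memory_order_acquire);
            if (head_ == published_cache_) return std::nullopt;
        }
        Cluster& c = clusters_[head_ % NumClusters];
        assert(c.count > 0);  // published clusters are never empty
        T value = c.items[read_pos_++];
        if (read_pos_ == c.count) {
            read_pos_ = 0;
            ++head_;
            // Release: hand the drained cluster back to the producer.
            consumed_.store(head_, std::memory_order_release);
        }
        return value;
    }

private:
    std::array<Cluster, NumClusters> clusters_{};

    // Producer-side state, on its own cache line.
    alignas(64) std::atomic<std::size_t> published_{0};  // clusters published so far
    std::size_t tail_ = 0;  // producer-private copy of published_
    std::size_t fill_ = 0;  // items written into the current cluster

    // Consumer-side state, on a separate cache line.
    alignas(64) std::atomic<std::size_t> consumed_{0};  // clusters fully drained
    std::size_t head_ = 0;             // consumer-private copy of consumed_
    std::size_t published_cache_ = 0;  // last value read from published_
    std::size_t read_pos_ = 0;         // next item index in the current cluster
};
```

In this sketch, a larger `ClusterSize` spreads the cost of each shared-counter update over more items but delays the moment the consumer first sees data. That is one plausible reading of the "appropriate parameter" mentioned in the abstract, not a confirmed one.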