A coarse-grained reconfigurable computing architecture with loop self-pipelining |
| |
Authors: | Yong Dou, GuiMing Wu, JinHui Xu, XingMing Zhou |
| |
Affiliation: | (1) National Laboratory for Parallel & Distributed Processing, National University of Defense Technology, Changsha, 410073, China |
| |
Abstract: | Reconfigurable computing seeks a balance between the high efficiency of custom computing and the flexibility of general-purpose
computing. This paper presents the implementation techniques used in LEAP, a coarse-grained reconfigurable array. It proposes
a speculative execution mechanism for dynamic loop scheduling that targets one iteration per cycle, together with implementation
techniques that support decoupled synchronization between the token generator and the collector. The paper also introduces
techniques for exploiting both intra- and inter-iteration data dependences, with the help of two instructions for special
data reuse in loop-carried dependences. Experimental results show that LEAP performs, on average, only 3% as many memory accesses
as a RISC processor simulator with no memory optimization. In a practical image matching application, the LEAP architecture
achieves a speedup of about 34 times in execution cycles compared with general-purpose processors.
Supported by the National Natural Science Foundation of China (Grant Nos. 60633050 and 60621003) and the National High Technology
Research and Development Program of China (Grant No. 2007AA01Z06) |
| |
Keywords: | reconfigurable computing; loop pipelining; data driven; register promotion |
This article is indexed in CNKI, VIP (维普), SpringerLink, and other databases. |
|