Post-pass partitioning of signal processing programs
Authors:Chris J. Newburn, John Paul Shen
Affiliation:(1) Department of Electrical and Computer Engineering, Carnegie Mellon University, USA
Abstract:
Symmetric multiprocessor (SMP) systems are increasingly common, not only as high-throughput servers, but also as a vehicle for executing a single application in parallel in order to reduce its execution latency. This article presents Pedigree, a compilation tool that employs a new partitioning heuristic based on the program dependence graph (PDG). Pedigree creates overlapping, potentially interdependent threads, each executing on a subset of the SMP processors that matches the thread’s available parallelism. A unified framework is used to build threads from procedures, loop nests, loop iterations, and smaller constructs. Pedigree does not require any parallel language support; it is a post-compilation tool that reads in object code. The SDIO Signal and Data Processing Benchmark Suite has been selected as an example of real-time, latency-sensitive code. Its coarse-grained data flow parallelism is naturally exploited by Pedigree to achieve speedups of 1.63×/2.13× (mean/max) and 1.71×/2.41× on two and four processors, respectively. There is roughly a 20% improvement over existing techniques that exploit only data parallelism. By exploiting the unidirectional flow of data for coarse-grained pipelining, the synchronization overhead is typically limited to less than 6% for a synchronization latency of 100 cycles, and less than 2% for 10 cycles. This research was supported by ONR contract numbers N00014-91-J-1518 and N00014-96-1-0347. We would like to thank the Pittsburgh Supercomputing Center for use of their Alpha systems.
Keywords:Post-pass, partitioning, threading, multiprocessing, compiler, PDG, retargetable, Pedigree
This article is indexed in SpringerLink and other databases.