共查询到10条相似文献,搜索用时 78 毫秒
1.
2.
Ramanujam J. Sadayappan P. 《Parallel and Distributed Systems, IEEE Transactions on》1991,2(4):472-482
A solution to the problem of partitioning data for distributed memory machines is discussed. The solution uses a matrix notation to describe array accesses in fully parallel loops, which allows the derivation of sufficient conditions for communication-free partitioning (decomposition) of arrays. A series of examples that illustrate the effectiveness of the technique for linear references, the use of loop transformations in deriving the necessary data decompositions, and a formulation that aids in deriving heuristics for minimizing a communication when communication-free partitions are not feasible are presented 相似文献
3.
We investigate the lattice-based array partitioning based on the theory of the Smith Normal Form and we present two elegant techniques for partitioning arrays in parallel DoAll loops for message-passing parallel machines: (1) DoAll loops with constant dependencies for communication-free partitioning: a general solution of all possible communication-free partitioning is derived where the dependencies among array references are described in constant distance vectors. (2) DoAll loops with non-constant dependencies for block-communication partitioning: the dependencies among array references are described in non-constant distance vectors. We derive the partitioning equations which allocate all remote data to a unique processor such that only one block-communication can obtain all the remote data for the computation. By using the Smith Normal Form decomposition, we are also able to verify our partitioning results. 相似文献
4.
Statement-Level Communication-Free Partitioning Techniques for Parallelizing Compilers 总被引:3,自引:3,他引:0
This paper addresses the problem of communication-free partition of iteration spaces and data spaces along hyperplanes. To finding more possible communication-free hyperplane partitions, we treat statements within a loop body as separate schedulable units. Instead of using the information about data dependence distance or direction vectors, our technique explicitly formulates array references as transformations from statement-iteration spaces to data spaces. Based on these transformations, the necessary and sufficient conditions for communication-free partition along hyperplanes to be feasible have been proposed. This approach can be applied to all programs with an imperfectly nested loop or sequences of imperfectly nested loops, whose array references are affine functions of outer loop indices or loop invariant variables. The proposed approach is more practical than existing methods in finding the data and computation distribution patterns that can cause the processor to execute fully-parallel on multicomputers without any interprocessor communication. 相似文献
5.
Iain Phillips Maria Grazia Vigliotti 《Electronic Notes in Theoretical Computer Science》2005,128(2):185
Palamidessi has shown that the π-calculus with mixed choice is powerful enough to solve the leader election problem on a symmetric ring of processes. We show that this is also possible in the calculus of Mobile Ambients (MA), without using communication or restriction. Following Palamidessi's methods, we deduce that there is no encoding satisfying certain conditions from MA into CCS. We also show that the calculus of Boxed Ambients is more expressive than its communication-free fragment. 相似文献
6.
Tzung-Shi Chen Jang-Ping Sheu 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(9):924-938
In distributed memory multicomputers, local memory accesses are much faster than those involving interprocessor communication. For the sake of reducing or even eliminating the interprocessor communication, the array elements in programs must be carefully distributed to local memory of processors for parallel execution. We devote our efforts to the techniques of allocating array elements of nested loops onto multicomputers in a communication-free fashion for parallelizing compilers. We first analyze the pattern of references among all arrays referenced by a nested loop, and then partition the iteration space into blocks without interblock communication. The arrays can be partitioned under the communication-free criteria with nonduplicate or duplicate data. Finally, a heuristic method for mapping the partitioned array elements and iterations onto the fixed-size multicomputers under the consideration of load balancing is proposed. Based on these methods, the nested loops can execute without any communication overhead on the distributed memory multicomputers. Moreover, the performance of the strategies with nonduplicate and duplicate data for matrix multiplication is studied 相似文献
7.
S?awomir Lasota 《Information Processing Letters》2009,109(15):850-855
We investigate the simulation preorder between finite-state systems and a simple subclass of BPP-nets (communication-free nets). We show EXPSPACE lower bounds for the simulation problems, in both directions, as well as for the simulation equivalence. Our results improve PSPACE and co-NP lower bounds for the simulation between finite-state systems and BPP-nets, given by Ku?era and Mayr in [A. Ku?era, R. Mayr, Simulation preorder over simple process algebras, Information and Computation 173 (2) (2002) 184-198]. 相似文献
8.
9.
10.
Shao Shuyi Jones Alex K. Melhem Rami 《Parallel and Distributed Systems, IEEE Transactions on》2009,20(3):331-345
In this paper we explore compiler techniques for achieving efficient communications on circuit switching interconnection networks. We propose a compilation framework for identifying communication patterns and compiling these patterns as network configuration directives. This has the potential of providing significant performance benefits when connections can be established in the network prior to the actual communications. The framework includes a flexible and powerful communication pattern representation scheme that captures the property of communication patterns and allows manipulation of these patterns. In this way, communication phases can be identified within the application. Additionally, we extend the classification of static and dynamic communications to include persistent communications. Persistent communications are a subclass of dynamic communications that remain unchanged for large segments of the application execution. An experimental compiler has been developed to implement the framework. This compiler is capable of detecting both static and persistent communications within an application. We show that for the NAS Parallel Benchmarks, 100% of the point-to-point communications can be classified as either static or persistent and 100% of the collectives are either static or persistent with the exception of IS. Simulation-based performance analysis demonstrates the benefit of using our compiler techniques for achieving efficient communications in multiprocessor systems. 相似文献