期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

陈莉张兆庆冯晓兵《计算机学报》2003,26(2):180-187

针对并行循环套序列，提出一种冗余计算分割的通信优化方法，根据数据流分析，文中给出用以确定每个循环套的冗余计算量的一般方法，并在此基础上提出冗余计算分割的实现和判定，针对规则依赖的程序，该文还提出了一个高效的冗余计算分割的实现方法，该技术已经在一个并行编译器中实现，试验结果表明，它比传统的通信优化技术有明显的优越性。相似文献

2.

Compile-time techniques for data distribution in distributed memorymachines

Ramanujam J. Sadayappan P. 《Parallel and Distributed Systems, IEEE Transactions on》1991,2(4):472-482

A solution to the problem of partitioning data for distributed memory machines is discussed. The solution uses a matrix notation to describe array accesses in fully parallel loops, which allows the derivation of sufficient conditions for communication-free partitioning (decomposition) of arrays. A series of examples that illustrate the effectiveness of the technique for linear references, the use of loop transformations in deriving the necessary data decompositions, and a formulation that aids in deriving heuristics for minimizing a communication when communication-free partitions are not feasible are presented 相似文献

3.

Automatic Array Partitioning Based on the Smith Normal Form

Eric Hung-Yu?Tseng Jean-Luc?Gaudiot Email author 《International journal of parallel programming》2005,33(1):35-56

We investigate the lattice-based array partitioning based on the theory of the Smith Normal Form and we present two elegant techniques for partitioning arrays in parallel DoAll loops for message-passing parallel machines: (1) DoAll loops with constant dependencies for communication-free partitioning: a general solution of all possible communication-free partitioning is derived where the dependencies among array references are described in constant distance vectors. (2) DoAll loops with non-constant dependencies for block-communication partitioning: the dependencies among array references are described in non-constant distance vectors. We derive the partitioning equations which allocate all remote data to a unique processor such that only one block-communication can obtain all the remote data for the computation. By using the Smith Normal Form decomposition, we are also able to verify our partitioning results. 相似文献

4.

Statement-Level Communication-Free Partitioning Techniques for Parallelizing Compilers 总被引：3，自引：3，他引：0

Shih Kuei-Ping Sheu Jang-Ping Huang Chua-Huang 《The Journal of supercomputing》2000,15(3):243-269

This paper addresses the problem of communication-free partition of iteration spaces and data spaces along hyperplanes. To finding more possible communication-free hyperplane partitions, we treat statements within a loop body as separate schedulable units. Instead of using the information about data dependence distance or direction vectors, our technique explicitly formulates array references as transformations from statement-iteration spaces to data spaces. Based on these transformations, the necessary and sufficient conditions for communication-free partition along hyperplanes to be feasible have been proposed. This approach can be applied to all programs with an imperfectly nested loop or sequences of imperfectly nested loops, whose array references are affine functions of outer loop indices or loop invariant variables. The proposed approach is more practical than existing methods in finding the data and computation distribution patterns that can cause the processor to execute fully-parallel on multicomputers without any interprocessor communication. 相似文献

5.

Leader Election in Rings of Ambient Processes

Iain Phillips Maria Grazia Vigliotti 《Electronic Notes in Theoretical Computer Science》2005,128(2):185

Palamidessi has shown that the π-calculus with mixed choice is powerful enough to solve the leader election problem on a symmetric ring of processes. We show that this is also possible in the calculus of Mobile Ambients (MA), without using communication or restriction. Following Palamidessi's methods, we deduce that there is no encoding satisfying certain conditions from MA into CCS. We also show that the calculus of Boxed Ambients is more expressive than its communication-free fragment. 相似文献

6.

Communication-free data allocation techniques for parallelizingcompilers on multicomputers

Tzung-Shi Chen Jang-Ping Sheu 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(9):924-938

In distributed memory multicomputers, local memory accesses are much faster than those involving interprocessor communication. For the sake of reducing or even eliminating the interprocessor communication, the array elements in programs must be carefully distributed to local memory of processors for parallel execution. We devote our efforts to the techniques of allocating array elements of nested loops onto multicomputers in a communication-free fashion for parallelizing compilers. We first analyze the pattern of references among all arrays referenced by a nested loop, and then partition the iteration space into blocks without interblock communication. The arrays can be partitioned under the communication-free criteria with nonduplicate or duplicate data. Finally, a heuristic method for mapping the partitioned array elements and iterations onto the fixed-size multicomputers under the consideration of load balancing is proposed. Based on these methods, the nested loops can execute without any communication overhead on the distributed memory multicomputers. Moreover, the performance of the strategies with nonduplicate and duplicate data for matrix multiplication is studied 相似文献

7.

EXPSPACE lower bounds for the simulation preorder between a communication-free Petri net and a finite-state system

S?awomir Lasota 《Information Processing Letters》2009,109(15):850-855

We investigate the simulation preorder between finite-state systems and a simple subclass of BPP-nets (communication-free nets). We show EXPSPACE lower bounds for the simulation problems, in both directions, as well as for the simulation equivalence. Our results improve PSPACE and co-NP lower bounds for the simulation between finite-state systems and BPP-nets, given by Ku?era and Mayr in [A. Ku?era, R. Mayr, Simulation preorder over simple process algebras, Information and Computation 173 (2) (2002) 184-198]. 相似文献

8.

Communication-Free Alignment for Array References with Linear Subscripts in Three Loop Index Variables or Quadratic Subscripts

Chang Weng-Long Chu Chih-Ping Wu Jia-Hwa 《The Journal of supercomputing》2001,20(1):67-83

相似文献

9.

Using elementary linear algebra to solve data alignment for arrays with linear or quadratic references

Weng-Long Chang Jih-Woei Huang Chih-Ping Chu 《Parallel and Distributed Systems, IEEE Transactions on》2004,15(1):28-39

相似文献

10.

Compiler Techniques for Efficient Communications in Circuit Switched Networks for Multiprocessor Systems

Shao Shuyi Jones Alex K. Melhem Rami 《Parallel and Distributed Systems, IEEE Transactions on》2009,20(3):331-345

In this paper we explore compiler techniques for achieving efficient communications on circuit switching interconnection networks. We propose a compilation framework for identifying communication patterns and compiling these patterns as network configuration directives. This has the potential of providing significant performance benefits when connections can be established in the network prior to the actual communications. The framework includes a flexible and powerful communication pattern representation scheme that captures the property of communication patterns and allows manipulation of these patterns. In this way, communication phases can be identified within the application. Additionally, we extend the classification of static and dynamic communications to include persistent communications. Persistent communications are a subclass of dynamic communications that remain unchanged for large segments of the application execution. An experimental compiler has been developed to implement the framework. This compiler is capable of detecting both static and persistent communications within an application. We show that for the NAS Parallel Benchmarks, 100% of the point-to-point communications can be classified as either static or persistent and 100% of the collectives are either static or persistent with the exception of IS. Simulation-based performance analysis demonstrates the benefit of using our compiler techniques for achieving efficient communications in multiprocessor systems. 相似文献