
1.

Operation of an experimental facility for fabrication of fuel elements and fuel assemblies of the BOR-60 containing vibrocompacted fuel

R. Bunk U. Leske R. Krompass Z. Pretch K. Rudolf R. Herbig K. Pitch V. A. Tsykanov O. V. Skiba V. A. Makarov L. P. Bol'shakov P. T. Porodnov A. A. Maershin S. S. Keruchen'ko 《Atomic Energy》1989,67(5):802-806

Translated from Atomnaya Énergiya, Vol. 67, No. 5, pp. 320–323, November, 1989.

2.

Techniques for pipelined broadcast on ethernet switched clusters

Pitch Patarasuk Xin Yuan Ahmad Faraj 《Journal of Parallel and Distributed Computing》2008

By splitting a large broadcast message into segments and broadcasting the segments in a pipelined fashion, pipelined broadcast can achieve high performance in many systems. In this paper, we investigate techniques for efficient pipelined broadcast on clusters connected by multiple Ethernet switches. Specifically, we develop algorithms for computing various contention-free broadcast trees that are suitable for pipelined broadcast on Ethernet switched clusters, extend the parametrized LogP model for predicting appropriate segment sizes for pipelined broadcast, show that the segment sizes computed based on the model yield high performance, and evaluate various pipelined broadcast schemes through experimentation on Ethernet switched clusters with various topologies. The results demonstrate that our techniques are practical and efficient for contemporary fast Ethernet and Giga-bit Ethernet clusters.
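The trade-off behind segment-size selection can be illustrated with a rough cost sketch. This is not the parametrized LogP model from the paper: it assumes a simple linear cost of `alpha + bytes * beta` per hop and a chain-shaped broadcast tree, and the names `pipelined_chain_time`, `best_segment_size`, `alpha`, and `beta` are hypothetical.

```python
import math


def pipelined_chain_time(msg_bytes, seg_bytes, num_nodes, alpha, beta):
    """Estimate pipelined broadcast time along a chain of num_nodes nodes.

    Assumption: sending one segment over one hop costs alpha + seg_bytes * beta,
    and every segment is charged as a full-size segment.
    """
    num_segments = math.ceil(msg_bytes / seg_bytes)
    per_segment = alpha + seg_bytes * beta
    # The first segment needs (num_nodes - 1) hops to reach the last node;
    # each remaining segment arrives one pipeline step later.
    steps = (num_nodes - 1) + (num_segments - 1)
    return steps * per_segment


def best_segment_size(msg_bytes, num_nodes, alpha, beta, candidates):
    """Pick the candidate segment size with the lowest estimated time."""
    return min(
        candidates,
        key=lambda s: pipelined_chain_time(msg_bytes, s, num_nodes, alpha, beta),
    )
```

Under this toy model, small segments pay the per-message overhead `alpha` many times, while large segments lengthen each pipeline step, so an intermediate size tends to win.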

3.

Bandwidth Efficient All-to-All Broadcast on Switched Clusters

Ahmad Faraj Pitch Patarasuk Xin Yuan 《International journal of parallel programming》2008,36(4):426-453

Clusters of workstations employ flexible topologies: regular, irregular, and hierarchical topologies have been used in such systems. The flexibility poses challenges for developing efficient collective communication algorithms since the network topology can potentially have a strong impact on the communication performance. In this paper, we consider the all-to-all broadcast operation on clusters with cut-through and store-and-forward switches. We show that near-optimal all-to-all broadcast on a cluster with any topology can be achieved by only using the links in a spanning tree of the topology when the message size is sufficiently large. The result implies that increasing network connectivity beyond the minimum tree connectivity does not improve the performance of the all-to-all broadcast operation when the most efficient topology specific algorithm is used. All-to-all broadcast algorithms that achieve near-optimal performance are developed for clusters with cut-through and clusters with store-and-forward switches. We evaluate the algorithms through experiments and simulations. The empirical results confirm our theoretical finding.

4.

Bandwidth optimal all-reduce algorithms for clusters of workstations (cited by: 1; self-citations: 0; citations by others: 1)

Pitch Patarasuk Xin Yuan 《Journal of Parallel and Distributed Computing》2009

We consider an efficient realization of the all-reduce operation with large data sizes in cluster environments, under the assumption that the reduce operator is associative and commutative. We derive a tight lower bound of the amount of data that must be communicated in order to complete this operation and propose a ring-based algorithm that only requires tree connectivity to achieve bandwidth optimality. Unlike the widely used butterfly-like all-reduce algorithm that incurs network contention in SMP/multi-core clusters, the proposed algorithm can achieve contention-free communication in almost all contemporary clusters, including SMP/multi-core clusters and Ethernet switched clusters with multiple switches. We demonstrate that the proposed algorithm is more efficient than other algorithms on clusters with different nodal architectures and networking technologies when the data size is sufficiently large.
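A ring-based all-reduce is commonly organized as a reduce-scatter followed by an all-gather. The sketch below simulates that textbook form in a single process, using addition as the associative and commutative operator; it may differ in detail from the algorithm proposed in the paper, and the function name `ring_allreduce` is hypothetical.

```python
def ring_allreduce(vectors):
    """Simulate a ring all-reduce (sum) over equal-length lists, one per rank."""
    p = len(vectors)
    n = len(vectors[0])
    # Chunk c covers indices bounds[c] .. bounds[c + 1] - 1.
    bounds = [(i * n) // p for i in range(p + 1)]
    data = [list(v) for v in vectors]

    # Reduce-scatter: in step s, rank r sends chunk (r - s) mod p to rank r + 1,
    # which adds it to its own copy of that chunk.
    for s in range(p - 1):
        sends = []
        for r in range(p):
            c = (r - s) % p
            sends.append((c, data[r][bounds[c]:bounds[c + 1]]))
        for r in range(p):
            c, payload = sends[(r - 1) % p]
            lo = bounds[c]
            for i, x in enumerate(payload):
                data[r][lo + i] += x

    # Now rank r holds the fully reduced chunk (r + 1) mod p.
    # All-gather: in step s, rank r forwards chunk (r + 1 - s) mod p to rank r + 1,
    # which overwrites its own copy with the reduced values.
    for s in range(p - 1):
        sends = []
        for r in range(p):
            c = (r + 1 - s) % p
            sends.append((c, data[r][bounds[c]:bounds[c + 1]]))
        for r in range(p):
            c, payload = sends[(r - 1) % p]
            lo = bounds[c]
            data[r][lo:lo + len(payload)] = payload

    return data
```

In this sketch every rank sends 2(p - 1) chunks of roughly n/p elements each, and each rank only ever talks to its ring successor.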

5.

A Study of Process Arrival Patterns for MPI Collective Operations (cited by: 1; self-citations: 0; citations by others: 1)

Ahmad Faraj Pitch Patarasuk Xin Yuan 《International journal of parallel programming》2008,36(6):543-570

Process arrival pattern, which denotes the timing when different processes arrive at an MPI collective operation, can have a significant impact on the performance of the operation. In this work, we characterize the process arrival patterns in a set of MPI programs on two common cluster platforms, use a micro-benchmark to study the process arrival patterns in MPI programs with balanced loads, and investigate the impacts of different process arrival patterns on collective algorithms. Our results show that (1) the differences between the times when different processes arrive at a collective operation are usually sufficiently large to affect the performance; (2) application developers in general cannot effectively control the process arrival patterns in their MPI programs in the cluster environment: balancing loads at the application level does not balance the process arrival patterns; and (3) the performance of collective communication algorithms is sensitive to process arrival patterns. These results indicate that process arrival pattern is an important factor that must be taken into consideration in developing and optimizing MPI collective routines. We propose a scheme that achieves high performance with different process arrival patterns, and demonstrate that by explicitly considering process arrival pattern, more efficient MPI collective routines than the current ones can be obtained.

6.

A Message Scheduling Scheme for All-to-All Personalized Communication on Ethernet Switched Clusters (cited by: 1; self-citations: 0; citations by others: 1)

Ahmad Faraj Xin Yuan Patarasuk P. 《Parallel and Distributed Systems, IEEE Transactions on》2007,18(2):264-276

We develop a message scheduling scheme for efficiently realizing all-to-all personalized communication (AAPC) on Ethernet switched clusters with one or more switches. To avoid network contention and achieve high performance, the message scheduling scheme partitions AAPC into phases such that 1) there is no network contention within each phase and 2) the number of phases is minimum. Thus, realizing AAPC with the contention-free phases computed by the message scheduling algorithm can potentially achieve the minimum communication completion time. In practice, phased AAPC schemes must introduce synchronizations to separate messages in different phases. We investigate various synchronization mechanisms and various methods for incorporating synchronizations into the AAPC phases. Experimental results show that the message scheduling-based AAPC implementations with proper synchronization consistently achieve high performance on clusters with many different network topologies when the message size is large.
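The idea of partitioning AAPC into phases can be sketched for the simplest case only. The code below assumes a single non-blocking switch with full-duplex links, where a phase is contention-free as long as every node sends to exactly one node and receives from exactly one node; it does not reproduce the paper's scheduling for clusters with multiple switches. The names `aapc_phases` and `is_permutation_phase` are hypothetical.

```python
def aapc_phases(num_nodes):
    """Return num_nodes - 1 phases; each phase is a list of (sender, receiver) pairs.

    Assumption: single-switch cluster. In phase k, node i sends to (i + k) mod num_nodes.
    """
    return [
        [(i, (i + k) % num_nodes) for i in range(num_nodes)]
        for k in range(1, num_nodes)
    ]


def is_permutation_phase(phase, num_nodes):
    """Check that every node sends exactly once and receives exactly once."""
    senders = sorted(s for s, _ in phase)
    receivers = sorted(r for _, r in phase)
    everyone = list(range(num_nodes))
    return senders == everyone and receivers == everyone
```

A real implementation would also need a synchronization step between phases so that messages from different phases do not overlap.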

7.

Panic and paranoia

PC Bermanzohn PB Arlow RJ Pitch SG Siris 《Canadian Metallurgical Quarterly》1997,58(7):325-326
