
1.

Packet Routing and PRAM Emulation on Star Graphs and Leveled Networks

《Journal of Parallel and Distributed Computing》1994,20(2):145-157

We consider the problem of permutation routing on a star graph, an interconnection network which has better properties than the hypercube. In particular, its degree and diameter are sublogarithmic in the network size. We present optimal randomized routing algorithms that run in O(D) steps (where D is the network diameter) for the worst-case input with high probability. We also show that for the n-way shuffle network with N = nⁿ nodes, there exists a randomized routing algorithm which runs in O(n) time with high probability. Another contribution of this paper is a universal randomized routing algorithm that could do optimal routing for a large class of networks (called leveled networks) which includes the star graph. The associative analysis is also network-independent. In addition, we present a deterministic routing algorithm, for the star graph, which is near optimal. All the algorithms we give are oblivious. As an application of our routing algorithms, we also show how to emulate a PRAM optimally on this class of networks.

2.

Randomized Selection on the Hypercube

Sanguthevar Rajasekaran 《Journal of Parallel and Distributed Computing》1996,37(2):187

In this paper, we present randomized algorithms for selection on the hypercube. We identify two variants of the hypercube, namely, the sequential model and the parallel model. In the sequential model, any node at any time can handle only communication along a single incident edge, whereas in the parallel model a node can communicate along all its incident edges at the same time. We specify three variations of the parallel model and present optimal randomized algorithms on all three versions of the parallel model. In particular, we show that selection on an input of size n can be performed on a p-node hypercube in time O((n/p) + log p) with high probability, on any of the three versions of the parallel model. This result is important in view of a lower bound that implies that selection needs Ω((n/p) log log p + log p) time on a p-node sequential hypercube. We modify our selection algorithm to run on the sequential hypercube, in which case it runs in an expected time nearly matching this lower bound. For the special case when n = p, our selection algorithm runs in an optimal O(log n) time on the sequential hypercube. Our algorithms are very simple and are most likely to perform well in practice.

3.

Computing Large Subcubes in Residual Hypercubes

Sridhar M. A. Raghavendra C. S. 《Journal of Parallel and Distributed Computing》1995,24(2)


4.

An Efficient Algorithm for Gray-to-Binary Permutation on Hypercubes

《Journal of Parallel and Distributed Computing》1994,20(1):114-120

Both Gray code and binary code are frequently used in mapping arrays into hypercube architectures. While the former is preferred when communication between adjacent array elements is needed, the latter is preferred for FFT-type communication. When different phases of computations have different types of communication patterns, the need arises to remap the data. We give a nearly optimal algorithm for permuting data from a Gray code mapping to a binary code mapping on a hypercube with communication restricted to one input and one output channel per node at a time. Our algorithm improves over the best previously known algorithm (J. Parallel Distrib. Comput. 4, 2 (Apr. 1987), 133-172) by nearly a factor of two and is optimal to within a factor of n/(n − 1) with respect to data transfer time on an n-cube. The expected speedup is confirmed by measurements on an Intel iPSC/2 hypercube.
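The remapping described in this abstract can be illustrated with a short sketch. It only computes which node each data item must travel to, assuming the usual binary-reflected Gray code; it does not reproduce the paper's communication schedule, and the function names are hypothetical.

```python
def binary_to_gray(b: int) -> int:
    """Binary-reflected Gray code of b."""
    return b ^ (b >> 1)


def gray_to_binary(g: int) -> int:
    """Inverse of binary_to_gray: XOR together all right shifts of g."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b


def gray_to_binary_permutation(n: int) -> list[int]:
    """Destination node for the data held by each node of an n-cube.

    Assumption: under the Gray mapping, array element j sits on node
    binary_to_gray(j); under the binary mapping it sits on node j.
    So the data on node g must move to node gray_to_binary(g).
    """
    return [gray_to_binary(g) for g in range(1 << n)]


perm = gray_to_binary_permutation(3)
print(perm)  # [0, 1, 3, 2, 7, 6, 4, 5]
```

Because the result is a permutation of the node addresses, every node sends and receives exactly one block of data, which is what makes the one-input, one-output channel restriction meaningful.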

5.

Cryptanalysis of RSA with two decryption exponents

Santanu Sarkar 《Information Processing Letters》2010,110(5):178-340

In this paper, we consider RSA with N=pq, where p,q are of the same bit size, i.e., q<p<2q. We study the weaknesses of RSA when multiple encryption and decryption exponents are considered with the same RSA modulus N. A decade back, Howgrave-Graham and Seifert (CQRE 1999) studied this problem in detail and presented the bounds on the decryption exponents for which RSA is weak. For the case of two decryption exponents, the bound was N^0.357. We have exploited a different lattice-based technique to show that RSA is weak beyond this bound. Our analysis provides improved results and shows that for two exponents, RSA is weak when the RSA decryption exponents are less than N^0.416. Moreover, we get further improvement in the bound when some of the most significant bits (MSBs) of the decryption exponents are the same (but unknown).

6.

A fast pessimistic one-step diagnosis algorithm for hypercube multicomputer systems

《Journal of Parallel and Distributed Computing》2004,64(4):546-553

This paper describes a system-level diagnosis algorithm for hypercube multicomputer systems. The algorithm is based on the PMC model and can isolate all faulty processors to within a set that contains at most one fault-free processor. If we denote by N the total number of processors in a hypercube system to be diagnosed, then, based on the judiciously designed data structures, the algorithm can run in O(N log₂ N) time; whereas the best-known diagnosis algorithm, the YML algorithm, runs in O(N^2.5) time. Consequently, the new algorithm is remarkably superior to the YML algorithm in terms of the time cost.

7.

Randomized parallel list ranking for distributed memory multiprocessors

Frank Dehne Siang W. Song 《International journal of parallel programming》1997,25(1):1-16

We present a randomized parallel list ranking algorithm for distributed memory multiprocessors, using a BSP type model. We first describe a simple version which requires, with high probability, log(3p) + log ln(n) = Õ(log p + log log n) communication rounds (h-relations with h = Õ(n/p)) and Õ(n/p) local computation. We then outline an improved version that requires, with high probability, only r ≤ (4k+6) log((2/3)p) + 8 = Õ(k log p) communication rounds, where k = min{i ≥ 0 | ln^(i+1) n ≤ ((2/3)p)^(2i+1)}. Note that k < ln*(n) is an extremely small number. For n ≤ 10^(10^100) and p ≥ 4, the value of k is at most 2. Hence, for a given number of processors, p, the number of communication rounds required is, for all practical purposes, independent of n. For n ≤ 1,500,000 and 4 ≤ p ≤ 2048, the number of communication rounds in our algorithm is bounded, with high probability, by 78, but the actual number of communication rounds observed so far is 25 in the worst case. For n ≤ 10^(10^100) and 4 ≤ p ≤ 2048, the number of communication rounds in our algorithm is bounded, with high probability, by 118; and we conjecture that the actual number of communication rounds required will not exceed 50. Our algorithm has a considerably smaller number of communication rounds than the list ranking algorithm used in Reid-Miller’s empirical study of parallel list ranking on the Cray C-90.⁽¹⁾ To our knowledge, Reid-Miller’s algorithm⁽¹⁾ was the fastest list ranking implementation so far. Therefore, we expect that our result will have considerable practical relevance.

8.

Design and analysis of the rotational binary graph as an alternative to hypercube and Torus

Seo Jung-hyun Lee HyeongOk 《The Journal of supercomputing》2020,76(9):7161-7176

Network cost is equal to degree × diameter and is one of the important measurements when evaluating graphs. Torus and hypercube are very well-known graphs. When these graphs expand, a Torus has an advantage in that its degree does not increase. A hypercube has a shorter diameter than that of other graphs, because when the graph expands, the diameter increases by 1. Hypercube Q_n has 2ⁿ nodes, and its diameter is n. We propose the rotational binary graph (RBG), which has the advantages of both hypercube and Torus. RBG_n has 2ⁿ nodes and a degree of 4. The diameter of RBG_n would be 1.5n + 1. In this paper, we first examine the topology properties of RBG. Second, we construct a binary spanning tree in RBG. Third, we compare other graphs to RBG considering network cost specifically. Fourth, we suggest a broadcast algorithm with a time complexity of 2n − 2. Finally, we prove that RBG_n embedded into hypercube Q_n results in dilation n, expansion 1, and congestion 7.
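To make the network-cost comparison concrete, the following sketch plugs in the figures quoted in this abstract: degree n and diameter n for Q_n, and degree 4 and diameter 1.5n + 1 for RBG_n. The function names are hypothetical, and the RBG diameter is taken at face value from the abstract rather than derived.

```python
def network_cost(degree: float, diameter: float) -> float:
    """Network cost as defined in the abstract: degree x diameter."""
    return degree * diameter


def hypercube_cost(n: int) -> float:
    # Q_n has 2**n nodes, degree n and diameter n.
    return network_cost(n, n)


def rbg_cost(n: int) -> float:
    # RBG_n has 2**n nodes, degree 4 and (per the abstract) diameter 1.5n + 1.
    return network_cost(4, 1.5 * n + 1)


for n in (4, 6, 7, 10, 20):
    print(n, hypercube_cost(n), rbg_cost(n))
```

With these figures the hypercube cost grows as n² while the RBG cost grows as 6n + 4, so the RBG has the lower cost for n ≥ 7.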


9.

Mutual Exclusion on a Hypercube

《Journal of Parallel and Distributed Computing》1993,17(4):327-336

We present a decentralized, symmetric mutual exclusion algorithm that is tailored to the hypercube architecture. Our algorithm has better time complexity than a suitable hypercube adaptation of Maekawa's O(√N) Mutual Exclusion algorithm. We show that, in the presence of contention, our algorithm has a shorter Blocking Delay than Maekawa's algorithm. In the absence of contention, our algorithm achieves optimal Round-trip Delay under both the n-port and the one-port communication models.

10.

Fast Parallel Algorithms for Solving Triangular Systems of Linear Equations on the Hypercube

Ibarra O. H. Kim M. H. 《Journal of Parallel and Distributed Computing》1994,20(3)

This paper presents efficient hypercube algorithms for solving triangular systems of linear equations by using various matrix partitioning and mapping schemes. Recently, several parallel algorithms have been developed for this problem. In these algorithms, the triangular solver is treated as the second stage of Gauss elimination. Thus, the triangular matrix is distributed by columns (or rows) in a wrap fashion since it is likely that the matrix is distributed this way after an LU decomposition has been done on the matrix. However, the efficiency of the algorithms is low. Our motivation is to develop various data partitioning and mapping schemes for hypercube algorithms by treating the triangular solver as an independent problem. Theoretically, the computation time of our best algorithm is ((12p + 1)n² + 36p³ − 28p²)/(24p²), and an upper bound on the communication time is 2αp log p (log n − log p) + 2α(log n − log p − 1) log p + (cn/p − 2c)(2 log p − 1) + log p(cn − c − α), where α is the (communication startup time)/(one entry scanning time), c is a constant, n is the order of the triangular system and p is the number of nodes in the hypercube. Experimental results show that the algorithm is efficient. The efficiency of the algorithm is 0.945 when p = 2, n = 513, and 0.93 when p = 8, n = 1025.

11.

Average-case analysis of the Modified Harmonic algorithm

Prakash Ramanan Kazuhiro Tsuga 《Algorithmica》1989,4(1-4):519-533

In this paper we analyze the average-case performance of the Modified Harmonic algorithm for on-line bin packing. We first analyze the average-case performance for arbitrary distribution of item sizes over (0,1]. This analysis is based on the following result. Let f₁ and f₂ be two linear combinations of random variables {N_i}_{i=1}^{k}, where the N_i's have a joint multinomial distribution for each n = Σ_{i=1}^{k} N_i. Let E(f₁) ≠ 0 and E(f₂) ≠ 0. Then lim_{n→∞} E(max(f₁, f₂))/n = lim_{n→∞} max(E(f₁), E(f₂))/n. We then consider the special case when the item sizes are uniformly distributed over (0,1]. For specific values of the parameters, the Modified Harmonic algorithm turns out to be better than the other two linear-time on-line algorithms (Next Fit and Harmonic) in both the worst case and the average case. We also obtain optimal values for the parameters of the algorithm from the average-case standpoint. For these values of the parameters, the average-case performance ratio is less than 1.19. This compares well with the performance ratios 1.333… and 1.2865… of the Next Fit algorithm and the Harmonic algorithm, respectively.
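For orientation, here is a minimal sketch of Next Fit, one of the two baseline on-line algorithms this abstract compares against. It is not the Modified Harmonic algorithm, whose parameters the abstract does not spell out. Bins have unit capacity, and the function name is hypothetical.

```python
from fractions import Fraction


def next_fit(items) -> int:
    """Pack items (sizes in (0, 1]) on-line into unit bins with Next Fit.

    Only the most recently opened bin stays open; an item that does not
    fit there closes it for good and opens a fresh bin.
    """
    bins = 0
    remaining = 0  # free space in the currently open bin
    for size in items:
        if size > remaining:
            bins += 1
            remaining = 1
        remaining -= size
    return bins


# Exact rational sizes avoid floating-point edge cases in the comparison.
sizes = [Fraction(1, 2), Fraction(3, 5), Fraction(1, 2), Fraction(2, 5)]
print(next_fit(sizes))  # 3
```

On this input Next Fit uses 3 bins, whereas an off-line packing needs only 2 (1/2 + 1/2 and 3/5 + 2/5), which is the kind of gap a performance ratio measures.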

12.

Optimal Elections in Labeled Hypercubes

《Journal of Parallel and Distributed Computing》1996,33(1):76-83

We study the message complexity of the Election Problem in hypercube networks, when the processors have a “Sense of Direction,” i.e., the capability to distinguish between adjacent communication links according to some globally consistent scheme. We present two models of Sense of Direction, which differ regarding the way the labeling of the links in the network is done: either by matching based on dimensions or by distance along a Hamiltonian cycle. In the dimension model, we give an optimal linear algorithm which uses the natural dimensional labeling of the communication links. We prove that, in the distance-based case, the graph symmetry of the hypercube is broken and, thus, the leader Election does not require a global maximum-finding algorithm: O(1) messages suffice to select the leader, whereas the Θ(N) messages are required only for the final broadcasting. Finally, we study the communication cost of changing one orientation labeling to the other and prove that O(N) messages suffice.

13.

Computing the subset partial order for dense families of sets

Amr Elmasry 《Information Processing Letters》2009,109(18):1082-1086

We give an algorithm to compute the subset partial order (called the subset graph) for a family F of sets containing k sets with N elements in total and domain size n. Our algorithm requires O(nk²/log k) time and space on a Pointer Machine. When F is dense, i.e. N = Θ(nk), the algorithm requires O(N²/log² N) time and space. We give a construction for a dense family whose subset graph is of size Θ(N²/log² N), indicating the optimality of our algorithm for dense families. The subset graph can be dynamically maintained when F undergoes set insertions and deletions in O(nk/log k) time per update (that is sub-linear in N for the case of dense families). If we assume words of b ≤ k bits, allow bits to be packed in words, and use bitwise operations, the above running time and space requirements can be reduced by a factor of b log(k/b+1)/log k and b² log(k/b+1)/log k respectively.
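The object being computed can be shown with a naive baseline: test every ordered pair of sets for containment, packing each set into an integer bitmask so that one containment test is a single bitwise operation. This is not the paper's algorithm and does not achieve its bounds. It assumes the domain elements are the integers 0..n−1, and the function name is hypothetical.

```python
def subset_graph(family):
    """Return all pairs (i, j), i != j, with family[i] a subset of family[j].

    Assumption: elements are non-negative integers, so each set can be
    packed into one Python int used as a bitmask.
    """
    masks = []
    for s in family:
        mask = 0
        for x in s:
            mask |= 1 << x
        masks.append(mask)

    edges = []
    for i, a in enumerate(masks):
        for j, b in enumerate(masks):
            # a is a subset of b exactly when a has no bit outside b.
            if i != j and (a & b) == a:
                edges.append((i, j))
    return edges


family = [{0}, {0, 1}, {2}, {0, 1, 2}]
print(subset_graph(family))  # [(0, 1), (0, 3), (1, 3), (2, 3)]
```

Two equal sets produce an edge in both directions under this definition; a caller that wants a strict partial order would need to merge duplicates first.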

14.

Construction d’approximations des espaces de sobolev sur des reseaux en simplexes

F. di Gugliemlo 《Calcolo》1969,6(2):279-331

The purpose of the present paper is to construct approximations of the Sobolev spaces W^{m,p}(Ω) by piecewise polynomial functions on sets of simplexes of the Euclidean space Rⁿ. These approximations are obtained by repeated convolution of the characteristic function of the unit hypercube of Rⁿ with a positive measure. Although they involve polynomials of lower degree, they yield the same accuracy as the approximations on rectangular nets usually adopted. Moreover, when used for the approximate solution of variational boundary value problems, they lead to systems of linear equations with matrices where the diagonals with non-vanishing elements are less numerous; their number is reduced from (2m+1)ⁿ, for approximations on rectangular nets, to 2(2m)ⁿ − (2m − 1)ⁿ for approximations on simplex nets. The gain thus realised cannot be improved when one requires that these approximations verify a commutation property with differentiation operators.

15.

A (4n − 9)/3 diagnosis algorithm on n-dimensional cube network

Xiaofan Yang Yuan Yan Tang 《Information Sciences》2007,177(8):1771-1781

As a generalization of the precise and pessimistic diagnosis strategies of system-level diagnosis of multicomputers, the t/k diagnosis strategy can significantly improve the self-diagnosing capability of a system at the expense of no more than k fault-free processors (nodes) being mistakenly diagnosed as faulty. In the case k ≥ 2, to our knowledge, there is no known t/k diagnosis algorithm for general diagnosable systems or for any specific system. Hypercube is a popular topology for interconnecting processors of multicomputers. It is known that an n-dimensional cube is (4n − 9)/3-diagnosable. This paper addresses the (4n − 9)/3 diagnosis of the n-dimensional cube. By exploring the relationship between a largest connected component of the 0-test subgraph of a faulty hypercube and the distribution of the faulty nodes over the network, the fault diagnosis of an n-dimensional cube can be reduced to those of two constituent (n − 1)-dimensional cubes. On this basis, a diagnosis algorithm is presented. Given that there are no more than 4n − 9 faulty nodes, this algorithm can isolate all faulty nodes to within a set in which at most three nodes are fault-free. The proposed algorithm can operate in O(N log₂ N) time, where N = 2ⁿ is the total number of nodes of the hypercube. The work of this paper provides insight into developing efficient t/k diagnosis algorithms for larger k values and for other types of interconnection networks.

16.

Edge-fault-tolerant vertex-pancyclicity of augmented cubes

Jung-Sheng Fu 《Information Processing Letters》2010,110(11):439-443

The n-dimensional augmented cube, denoted as AQn, a variation of the hypercube, possesses some properties superior to those of the hypercube. In this paper, we show that every vertex in AQn lies on a fault-free cycle of every length from 3 to 2ⁿ, even if there are up to n−1 edge faults. We also show that our result is optimal.
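The following sketch shows why cycles of length 3 exist in the augmented cube at all, given that the ordinary hypercube is bipartite and has none. It assumes the usual definition of AQn, which this abstract does not state: vertices are n-bit addresses, and a vertex is adjacent to every address obtained by flipping one bit (hypercube edges) or by flipping all of bits 0..i for some i ≥ 1 (complement edges). The function names are hypothetical, and the sketch covers only the fault-free case, not the paper's proof.

```python
def aq_neighbors(u: int, n: int) -> list[int]:
    """Neighbors of vertex u in the augmented cube AQ_n (assumed definition)."""
    hypercube_edges = [u ^ (1 << i) for i in range(n)]
    # Flip the i + 1 lowest bits at once, for i = 1 .. n - 1.
    complement_edges = [u ^ ((1 << (i + 1)) - 1) for i in range(1, n)]
    return hypercube_edges + complement_edges


def triangle_through(u: int, n: int) -> tuple[int, int, int]:
    """A 3-cycle through u, valid for n >= 2: u, u^1, u^3."""
    return (u, u ^ 1, u ^ 3)


n = 3
for u in range(1 << n):
    a, b, c = triangle_through(u, n)
    assert b in aq_neighbors(a, n)
    assert c in aq_neighbors(b, n)
    assert a in aq_neighbors(c, n)
print(aq_neighbors(0, 3))  # [1, 2, 4, 3, 7]
```

Each vertex has degree 2n − 1 under this definition, which is consistent with the abstract's tolerance of up to n − 1 edge faults leaving plenty of incident edges intact.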

17.

Routing Permutations on Hypercube Machines with Half-Duplex Links

《Journal of Parallel and Distributed Computing》1994,20(1):14-19

Algorithms are presented for realizing permutations on a less restrictive hypercube model called the S-MIMD (synchronous MIMD), which allows at most one data transfer on a given communication link at a given time instant, and where data movements are not restricted to a single dimension at a given time. First, an optimal algorithm for bit-permute permutations is developed from a very simple realization of the shuffle on a 3-cube; this algorithm needs 2⌊n/2⌋ routing steps on an n-dimensional hypercube. The technique is then extended to an optimal algorithm for bit-permute-complement permutations, one that needs n routing steps. Also, algorithms are sketched for routing permutations in the classes Ω and Ω⁻¹ in 3⌈n/2⌉ routing steps, yielding an off-line algorithm for routing arbitrary permutations in 3n steps.

18.

Parallel Heap Operations on an EREW PRAM

《Journal of Parallel and Distributed Computing》1994,20(2):248-255

We consider parallel heap operations on the exclusive-read exclusive-write parallel random-access machine. We first present an O(n/p + log p) time parallel algorithm to construct a heap of n elements using p processors, which is optimal for p = θ(n/log n). We then propose a parallel root deletion algorithm. In a preparatory step, a data structure for dynamic processor allocation is constructed using O((n/log n)^{1 − 1/k}) processors in O(log k) time for some constant k, 1 ≤ k ≤ ⌈log(n/log n)⌉. A sequence of root deletions can then be performed, each of which takes O((log n)/p + log p + log log n) time using p processors. Finally, we discuss a parallel algorithm running in O((log n)/p + log p) time for inserting an element into a heap, which is optimal for p = θ((log n)/log log n). Both deletion and insertion algorithms run in O(log log n) time when p = θ((log n)/log log n).

19.

A Note on Optimal Time Broadcast in Faulty Hypercubes

《Journal of Parallel and Distributed Computing》1995,26(1):132-135

This note describes an algorithm for broadcasting a message on the n-dimensional hypercube in optimal time (n time units) and optimal communication (2ⁿ − 1 messages) in the presence of up to n − 2 arbitrary node or edge faults, assuming the set of faults is known to all nodes of the hypercube.
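The optimality targets quoted here (n time units and 2ⁿ − 1 messages) are those of the standard fault-free broadcast, which the sketch below reproduces: in step d, every node that already holds the message forwards it across dimension d. It does not handle faults and is not the note's algorithm; the function name is hypothetical.

```python
def broadcast_schedule(n: int, source: int = 0) -> list[list[tuple[int, int]]]:
    """Fault-free broadcast on an n-cube as a list of steps.

    Each step is a list of (sender, receiver) pairs. In step d every
    informed node sends across dimension d, doubling the informed set.
    """
    informed = [source]
    schedule = []
    for d in range(n):
        step = [(u, u ^ (1 << d)) for u in informed]
        schedule.append(step)
        informed.extend(v for _, v in step)
    return schedule


schedule = broadcast_schedule(3)
print(len(schedule))                        # 3
print(sum(len(step) for step in schedule))  # 7
```

Every node other than the source receives the message exactly once, which is why 2ⁿ − 1 messages is the minimum any broadcast can use.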

20.

On the real complexity of a complex DFT

I. S. Sergeev 《Problems of Information Transmission》2017,53(3):284-293

We present a method to construct a theoretically fast algorithm for computing the discrete Fourier transform (DFT) of order N = 2ⁿ. We show that the DFT of a complex vector of length N is performed with complexity of 3.76875N log₂ N real operations of addition, subtraction, and scalar multiplication.