
1.

Improved upper and lower bounds on the optimization of mixed chordal ring networks

James K. Lan 《Information Processing Letters》2009,109(13):757-882

Recently, Chen, Hwang and Liu [S.K. Chen, F.K. Hwang, Y.C. Liu, Some combinatorial properties of mixed chordal rings, J. Interconnection Networks 1 (2003) 3-16] introduced the mixed chordal ring network as a topology for interconnection networks. In particular, they showed that the amount of hardware and the network structure of the mixed chordal ring network are comparable to those of the (directed) double-loop network, yet the mixed chordal ring network can achieve a better diameter. More precisely, the mixed chordal ring network can achieve diameter about […] as compared to […] for the (directed) double-loop network, where N is the number of nodes in the network. One of the most important questions in interconnection networks is, for a given number of nodes, how to find an optimal network (a network with the smallest diameter) and how to construct it. Chen et al. gave upper and lower bounds for this optimization problem on the mixed chordal ring network. In this paper, we improve the upper and lower bounds to […] and […], respectively. In addition, we correct some deficiencies in the paper by Chen et al.

2.

A New Topology with Odd Degree for Multiprocessor Systems

《Journal of Parallel and Distributed Computing》1996,39(1):87-94

A new topology for interconnection networks has been proposed. The underlying network graph has N = 4ⁿ nodes (n ≥ 2) and is almost regular with maximum degree 5 and diameter ≤ ⌊(3/4) log₂ N⌋ + 1. Algorithms for point-to-point routing and single node broadcast have also been developed. It has also been shown that various algorithms for real life applications, e.g., matrix transpose, matrix multiplication, finding the sum/average/maximum/minimum of a set of data elements and ASCEND/DESCEND types of algorithms can be efficiently implemented on this topology. Finally, the underlying idea of constructing this network has been generalized to define a family of almost regular odd degree graphs of maximum degree 2j + 1 (j > 2) with N = (2j)ⁿ nodes and diameter ⌊(3/4) log_j N⌋ + 1.

3.

Practical constructive schemes for deterministic shared-memory access

A. Pietracaprina, F. P. Preparata 《Theory of Computing Systems》1997,30(1):3-37

We present three explicit schemes for distributing M variables among N memory modules, where M = Θ(N^1.5), M = Θ(N²), and M = Θ(N³), respectively. Each variable is replicated into a constant number of copies stored in distinct modules. We show that N processors, directly accessing the memories through a complete interconnection, can read/write any set of N variables in worst-case time O(N^(1/3)), O(N^(1/2)), and O(N^(2/3)), respectively, for the three schemes. The access times for the last two schemes are optimal with respect to the particular redundancy values used by such schemes. The address computation can be carried out efficiently by each processor without recourse to a complete memory map and requiring only O(1) internal storage.

4.

A novel constant degree and constant congestion DHT scheme for peer-to-peer networks (total citations: 3; self-citations: 0; citations by others: 3)

LI Dongsheng & LU Xicheng (School of Computer, National University of Defense Technology, Changsha, China) 《Science in China Series F (English Edition)》2005,48(4):421-436

1 Introduction and related work. In recent years, peer-to-peer computing has attracted significant attention from both industry and academia[1-3]. The core component of many proposed peer-to-peer systems is a distributed hash table (DHT) scheme[4,5] that uses a hash table-like interface to publish and look up data objects. Many proposed DHT schemes[6-15] are based on some traditional interconnection topology: Chord[6], Tapestry[7,8], Pastry[9] are based on hypercube topolog…

5.

A delay optimal coterie on the k-dimensional folded Petersen graph

《Journal of Parallel and Distributed Computing》2003,63(11):1026-1035

Coteries are an effective tool for enforcing mutual exclusion in distributed systems. Communication delay is an important metric to measure the performance of a coterie. The topology of the interconnection network in a distributed system also has an impact on its performance. The k-dimensional folded Petersen graph, a graph with 10^k nodes and diameter 2k, qualifies as a good network topology for large distributed systems. In this paper, we present a delay optimal coterie on the k-dimensional folded Petersen graph, FP_k. For any positive integer k, the coterie has message complexity 4^k and delay k. Moreover, this coterie is not vote-assignable.
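The node count 10^k and diameter 2k quoted above can be checked for small k with a short sketch. It assumes that FP_k is the k-fold Cartesian product of the Petersen graph, which the abstract does not state; the function names are hypothetical, and the coterie construction itself is not attempted.

```python
from collections import deque
from itertools import product

def petersen():
    """Adjacency sets of the Petersen graph on vertices 0..9."""
    adj = {v: set() for v in range(10)}
    for i in range(5):
        edges = [(i, (i + 1) % 5),            # outer 5-cycle
                 (i, i + 5),                  # spokes
                 (i + 5, (i + 2) % 5 + 5)]    # inner pentagram
        for a, b in edges:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def folded_petersen(k):
    """ASSUMPTION: FP_k is the k-fold Cartesian product of the Petersen graph."""
    p = petersen()
    adj = {}
    for v in product(range(10), repeat=k):
        nbrs = []
        for i in range(k):
            # Move along one Petersen edge in coordinate i, keep the rest fixed.
            for w in p[v[i]]:
                nbrs.append(v[:i] + (w,) + v[i + 1:])
        adj[v] = nbrs
    return adj

def eccentricity(adj, src):
    """Largest breadth-first-search distance from src."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())

def graph_diameter(adj):
    return max(eccentricity(adj, v) for v in adj)
```

Under that assumption, `folded_petersen(2)` has 100 nodes of degree 6, and a breadth-first search from every node gives the diameter.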

6.

Packet Routing and PRAM Emulation on Star Graphs and Leveled Networks

《Journal of Parallel and Distributed Computing》1994,20(2):145-157

We consider the problem of permutation routing on a star graph, an interconnection network which has better properties than the hypercube. In particular, its degree and diameter are sublogarithmic in the network size. We present optimal randomized routing algorithms that run in O(D) steps (where D is the network diameter) for the worst-case input with high probability. We also show that for the n-way shuffle network with N = nⁿ nodes, there exists a randomized routing algorithm which runs in O(n) time with high probability. Another contribution of this paper is a universal randomized routing algorithm that can do optimal routing for a large class of networks (called leveled networks) which includes the star graph. The associated analysis is also network-independent. In addition, we present a deterministic routing algorithm for the star graph which is near optimal. All the algorithms we give are oblivious. As an application of our routing algorithms, we also show how to emulate a PRAM optimally on this class of networks.

7.

Multi-mesh of trees with its parallel algorithms

《Journal of Systems Architecture》2004,50(4):193-206

In recent years the multi-mesh network [Proceedings of the Ninth International Parallel Processing Symposium, Santa Barbara, CA, April 25–28, 1995, 17; IEEE Trans. on Comput. 68 (5) (1999) 536] has attracted considerable interest among researchers for its efficient topological properties. Several parallel algorithms for various trivial and nontrivial problems have been mapped onto this network. However, because of its O(n) diameter, a large class of algorithms that involve frequent data broadcast in a row, in a column, or between diametrically opposite processors requires O(n) time on an n×n multi-mesh. In search of faster algorithms, we introduce in this paper a new network topology, called the multi-mesh of trees. This network is built around the multi-mesh network and the mesh of trees. As a result it can perform as efficiently as a multi-mesh network and also as efficiently as a mesh of trees. Several topological properties, including number of links, diameter, bisection width and decomposition, are discussed. We present parallel algorithms for finding the sum of n⁴ elements and for n²-point Lagrange interpolation, both in O(log n) time. The solution of an n²-degree polynomial equation, n²-point DFT computation and sorting of n² elements are all shown to run in O(log n) time too. The communication algorithms one-to-all, row broadcast and column broadcast are also described in O(log n) time. This can be compared with the O(n) time algorithms on the multi-mesh network for all these problems.

8.

Uniform minimal full-access networks

《Journal of Parallel and Distributed Computing》1988,5(4):383-403

Minimal Full-Access (MFA) networks are the class of all interconnection networks for N = 2ⁿ inputs and outputs that require a minimum number of switching elements and provide full access capability. In this paper, MFA networks with 2 × 2 switching elements that use the same interconnection pattern between the stages are studied; such networks are called uniform MFA (UMFA) networks. Most of the known networks, including the Omega, the binary n-cube, and the regular (2, 2)-banyan, belong to this class. A graph-theoretic approach is used to study the class of UMFA networks, and a procedure is described to derive topologically nonequivalent networks from the Omega network. It is shown that the number of UMFA networks grows exponentially with N, and a lower bound of about 2^N/32 is obtained for N ⩾ 32. We also outline an extension of our methods to derive similar bounds for networks using k × k switches, for k ⩾ 3.

9.

The hierarchical Petersen network: a new interconnection network with fixed degree

Jung-Hyun Seo, Jong-Seok Kim, Hyung Jae Chang, Hyeong-Ok Lee 《The Journal of supercomputing》2018,74(4):1636-1654

Network cost and the fixed-degree property of the graph are important factors in evaluating interconnection networks. In this paper, we propose the hierarchical Petersen network (HPN), which is constructed recursively and hierarchically with a Petersen graph as its basic module. The degree of HPN(n) is 5, and HPN(n) has \(10^n\) nodes and \(2.5 \times 10^n\) edges. We analyze its basic topological properties, routing algorithm, diameter, spanning tree, broadcasting algorithm and embedding. From the analysis, we prove that the diameter and network cost of HPN(n) are \(3\log _{10}N-1\) and \(15 \log _{10}N-1\), respectively, and that it contains a spanning tree with degree 4. In addition, we propose a link-disjoint one-to-all broadcasting algorithm and show that HPN(n) can be embedded into FP\(_k\) with expansion 1, dilation 2k and congestion 4. For most of the fixed-degree networks proposed, network cost and diameter require \(O(\sqrt{N})\) and the degree of the graph requires O(N). However, HPN(n) requires O(1) for the degree and \(O(\log _{10}N)\) for both diameter and network cost. As a result, the interconnection network suggested in this paper is superior to current fixed-degree and hierarchical networks in terms of network cost, diameter and the degree of the graph.

10.

Fast algorithms for the conjugate periodic function

Dr. M. H. Gutknecht 《Computing》1979,22(1):79-91

Two fast algorithms for the approximate computation of the conjugate periodic function are described. They are based on the fast Fourier transform and reduce the cost to O(N log N) operations, compared with O(N²) operations for Wittich's classical method. The second algorithm, for which an ALGOL 60 procedure is listed, allows the conjugate function to be evaluated on the even (or odd) numbered lattice points separately. (This feature is important for some applications.)
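The basic FFT idea can be illustrated with a minimal sketch; it is not either of the paper's two algorithms and does not reproduce the even/odd lattice feature. It assumes N equally spaced samples over one period, N a power of two, and the convention that the conjugate of cos(kt) is sin(kt) and the conjugate of sin(kt) is −cos(kt). All function names are hypothetical.

```python
import cmath
import math

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. No 1/n scaling."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def conjugate_function(samples):
    """Approximate the conjugate periodic function at the sample points."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("number of samples must be a power of two")
    coeffs = fft([complex(x) for x in samples])
    for k in range(n):
        if k == 0 or 2 * k == n:
            coeffs[k] = 0j        # drop the mean and the Nyquist term
        elif k < n // 2:
            coeffs[k] *= -1j      # positive frequencies: multiply by -i
        else:
            coeffs[k] *= 1j       # negative frequencies: multiply by +i
    return [z.real / n for z in fft(coeffs, invert=True)]
```

For example, sampling cos(t) + 0.5 sin(2t) at 16 points and passing the samples to `conjugate_function` returns values that agree with sin(t) − 0.5 cos(2t) up to rounding error.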

11.

Approximate algorithms for partitioning problems

Mohammad Ashraf Iqbal 《International journal of parallel programming》1991,20(5):341-361

We consider the problem of optimally assigning the modules of a parallel/pipelined program over the processors of a multiple processor system under certain restrictions on the interconnection structure of the program as well as the multiple computer system. We show that for a variety of such problems, it is possible to determine whether a partition of the modular program exists in which the load on any processor is within a certain bound. This method, when combined with a binary search over a fixed range, provides an optimal solution to the partitioning problem. The specific problems we consider are partitioning of (1) a chain structured parallel program over a chain-like computer system, (2) multiple chain-like programs over a host-satellite system, and (3) a tree structured parallel program over a host-satellite system. For a problem with N modules and M processors, the complexity of our algorithm is no worse than O(M log(N) log(W_T/ε)), where W_T is the cost of assigning all modules to one processor, and ε the desired accuracy. This algorithm provides an improvement over the recently developed best known algorithm that runs in O(MN log(N)) time. This research was supported by a grant from the Division of Research Extension and Advisory Services, University of Engineering and Technology Lahore, Pakistan. Further support was provided by NASA Contracts NAS1-17070 and NAS1-18107 while the author was resident at the Institute for Computer Applications in Science and Engineering (ICASE), NASA Langley Research Center, Hampton, Virginia, USA.
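The probe-plus-binary-search idea can be sketched for the simplest case, a chain of modules cut into contiguous blocks. This is an illustration under simplifying assumptions, not the paper's algorithm: communication costs are ignored, the probe is a plain greedy scan over all N modules (so it does not reach the quoted bound), and the names `feasible`, `min_bottleneck` and the accuracy parameter `eps` are hypothetical.

```python
def feasible(weights, m, bound):
    """Probe: can the chain be cut into at most m contiguous blocks,
    each with total load <= bound?"""
    blocks, load = 1, 0.0
    for w in weights:
        if w > bound:
            return False          # a single module already exceeds the bound
        if load + w > bound:
            blocks += 1           # start a new block on the next processor
            load = w
        else:
            load += w
    return blocks <= m

def min_bottleneck(weights, m, eps=1e-6):
    """Binary search over [max(weights), W_T] for the smallest feasible bound,
    accurate to within eps. W_T is the load of putting every module on one
    processor."""
    lo, hi = max(weights), sum(weights)   # hi is always feasible
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if feasible(weights, m, mid):
            hi = mid
        else:
            lo = mid
    return hi
```

With module weights `[2, 3, 4, 5, 6]` and two processors, the search converges to a bottleneck load of 11, corresponding to the split `[2, 3, 4] | [5, 6]`. The number of probes grows like log(W_T/eps), which is where that factor in the quoted complexity comes from.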

12.

How to find Steiner minimal trees in Euclidean d-space

Warren D. Smith 《Algorithmica》1992,7(1-6):137-177

This paper has two purposes. The first is to present a new way to find a Steiner minimum tree (SMT) connecting N sites in d-space, d ≥ 2. We present (in Appendix 1) a computer code for this purpose. This is the only procedure known to the author for finding Steiner minimal trees in d-space for d > 2, and also the first one which fits naturally into the framework of “backtracking” and “branch-and-bound.” Finding SMTs of up to N = 12 general sites in d-space (for any d) now appears feasible. We tabulate Steiner minimal trees for many point sets, including the vertices of most of the regular and Archimedean d-polytopes with ≤ 16 vertices. As a consequence of these tables, the Gilbert-Pollak conjecture is shown to be false in dimensions 3–9. (The conjecture remains open in other dimensions; it is probably false in all dimensions d with d ≥ 3, but it is probably true when d = 2.) The second purpose is to present some new theoretical results regarding the asymptotic computational complexity of finding SMTs to precision ε. We show that in two dimensions, Steiner minimum trees may be found exactly in exponential time O(C^N) on a real RAM. (All previous provable time bounds were superexponential.) If the tree is only wanted to precision ε, then there is an (N/ε)^O(√N)-time algorithm, which is subexponential if 1/ε grows only polynomially with N. Also, the rectilinear Steiner minimal tree of N points in the plane may be found in N^O(√N) time. J. S. Provan devised an O(N⁶/ε⁴)-time algorithm for finding the SMT of a convex N-point set in the plane. (Also the rectilinear SMT of such a set may be found in O(N⁶) time.) One therefore suspects that this problem may be solved exactly in polynomial time. We show that this suspicion is in fact true if a certain conjecture about the size of “Steiner sensitivity diagrams” is correct. All of these algorithms are for a “real RAM” model of computation allowing infinite precision arithmetic. They make no probabilistic or other assumptions about the input; the time bounds are valid in the worst case; and all our algorithms may be implemented with a polynomial amount of space. Only algorithms yielding the exact optimum SMT, or trees with lengths ≤ (1 + ε) × optimum, where ε is arbitrarily small, are considered here.

13.

Optical clustering

Frank Dehne 《The Visual computer》1986,2(1):39-43

This paper presents a definition of ‘optical clusters’ which is derived from the concept of optical resolution. The clustering problem (induced by this definition) is transformed such that the application of well known Computational Geometry methods yields efficient solutions. One result (which can be extended to different classes of objects and metrics) is the following. Given a set S of N disjoint line segments in E²:

The optical clusters with respect to a given separation parameter r ∈ R can be computed in time O(N log² N).
Given an interval [a, b] for the number m(S, r) of optical clusters which we want to compute, time O(N log² N) [O(N log² N + CN)] suffices to compute the interval [R(b), R(a)] = {r ∈ R | m(S, r) ∈ [a, b]} [all C optical clusterings with R(b) ≤ r ≤ R(a)].

14.

Efficient elections in chordal ring networks

Hagit Attiya, Jan van Leeuwen, Nicola Santoro, Shmuel Zaks 《Algorithmica》1989,4(1-4):437-446

We study the message complexity of the problem of distributively electing a leader in chordal rings. Such networks consist of a basic ring with additional links, the extreme cases being the oriented ring and the complete graph with a full sense of direction. We present a general election algorithm for these networks, and prove its optimality. As a corollary, we show that O(log n) chords at each processor suffice to obtain an algorithm that uses at most O(n) messages; this improves and extends a previous work, where an algorithm, also using O(n) messages, was suggested for the case where all n − 1 chords exist (the oriented complete network).

15.

Compact Routing on Chordal Rings of Degree 4

L. Narayanan, J. Opatrny 《Algorithmica》1999,23(1):72-96

A chordal ring G(n;c) of degree 4 is a ring of n nodes with chords connecting each vertex i to the vertex (i + c) mod n. In this paper we investigate compact routing schemes on such networks. We show an optimal boolean routing scheme for any such network that requires O(log n) bits of storage at each node, and O(1) time to compute a shortest path to any destination. This improves on the results of [16], which gives a linear time algorithm for such networks, and [6], where efficient routing schemes for certain fixed values of c were developed. Further, we show several bounds on interval routing schemes for such networks. We show that while every chordal ring has an optimal interval routing scheme with at most […] intervals on any edge, there exist chordal rings for which any optimal interval routing scheme that labels the vertices around the ring in the graph requires […] intervals on some edges. Additionally, there are chordal rings which admit no optimal one-interval routing schemes, regardless of the vertex labeling. We also consider interval routing schemes under relaxed requirements for the lengths of paths. Received September 5, 1997; revised December 1, 1997.
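The definition of G(n;c) translates directly into an adjacency structure. The sketch below builds it and measures the diameter by breadth-first search; it is only meant to show how chords shorten paths and does not implement any of the routing schemes from the paper. The function names are hypothetical.

```python
from collections import deque

def chordal_ring(n, c):
    """Adjacency sets of G(n;c): ring edges i±1 plus chords i±c (mod n)."""
    return {i: {(i + 1) % n, (i - 1) % n, (i + c) % n, (i - c) % n}
            for i in range(n)}

def ring_diameter(adj):
    """Diameter via BFS from node 0. One source is enough here because
    G(n;c) looks the same from every node (it is a circulant graph)."""
    dist = {0: 0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())
```

For instance, `ring_diameter(chordal_ring(13, 5))` gives 2, whereas the plain 13-node ring (obtained here with `c = 1`, where the chords coincide with ring edges) has diameter 6.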

16.

Practical Constructive Schemes for Deterministic Shared-Memory Access

A. Pietracaprina, F. P. Preparata 《Theory of Computing Systems》1997,30(1):3-37

We present three explicit schemes for distributing M variables among N memory modules, where M = Θ(N^1.5), M = Θ(N²), and M = Θ(N³), respectively. Each variable is replicated into a constant number of copies stored in distinct modules. We show that N processors, directly accessing the memories through a complete interconnection, can read/write any set of N variables in worst-case time O(N^(1/3)), O(N^(1/2)), and O(N^(2/3)), respectively, for the three schemes. The access times for the last two schemes are optimal with respect to the particular redundancy values used by such schemes. The address computation can be carried out efficiently by each processor without recourse to a complete memory map and requiring only O(1) internal storage. This paper was partially supported by NSF Grants CCR-91-96152 and CCR-94-00232, by ONR Contract N00014-91-J-4052, ARPA Order 8225, and by the ESPRIT III Basic Research Programme of the EC under Contract No. 9072 (Project GEPPCOM). Results reported here were presented in preliminary form at the 10th Symposium on Theoretical Aspects of Computer Science (Würzburg, Germany, 1993), and at the 5th ACM Symposium on Parallel Algorithms and Architectures (Velen, Germany, 1993).

17.

O(n) routing in rearrangeable networks

《Journal of Systems Architecture》2000,46(6):529-542

In (2n−1)-stage rearrangeable networks, the routing time for an arbitrary permutation is Ω(n²), compared with a propagation delay of only O(n). Here, we attempt to identify the sets of permutations which are routable in O(n) time in these networks. We define four classes of self-routable permutations for the Benes network. An O(n) algorithm is presented that identifies whether a permutation P belongs to one of the proposed self-routable classes and, if so, also generates the necessary control vectors for routing P. Therefore, both the identification and the switch setting problems are resolved in O(n) time by this algorithm. It covers all the permutations that are self-routable by any one of the proposed techniques. Some interesting relationships among these four classes of permutations are also explored by applying the concept of ‘group-transformations’ [N. Das, B.B. Bhattacharya, J. Dattagupta, Hierarchical classification of permutation classes in multistage interconnection networks, IEEE Trans. Comput. (1993) 665–677] to these permutations. The concepts developed here for the Benes network can easily be extended to a class of (2n−1)-stage networks which are topologically equivalent to the Benes network. As a result, the set of permutations routable in a (2n−1)-stage rearrangeable network in a time comparable to its propagation delay has been extended to a large extent.

18.

The design and analysis of an efficient load balancing algorithm employing the symmetric balanced incomplete block design

Okbin Lee 《Information Sciences》2006,176(15):2148-2160

In order to maintain load balancing in a distributed network, each node should obtain workload information from all the nodes in the network. This requires O(v²) communication complexity, where v is the number of nodes. First, we present a new synchronous dynamic distributed load balancing algorithm on a (v, k + 1, 1)-configured network applying a symmetric balanced incomplete block design, where v = k² + k + 1. Our algorithm designs a special adjacency matrix and then transforms it into a (v, k + 1, 1)-configured network for efficient communication. It requires only […] communication complexity, and each node receives workload information from all the nodes without redundancy, since each link carries the same amount of traffic for transferring workload information. Later, this algorithm is revised for distributed networks and is analyzed in terms of efficiency of load balancing.
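To make the (v, k + 1, 1) parameters concrete, the sketch below builds a symmetric design from a perfect difference set and checks its defining property: every pair of the v = k² + k + 1 points lies in exactly one block of size k + 1. The difference-set construction is a standard one chosen for illustration; the abstract does not say how the paper constructs its design or its adjacency matrix, and the function names are hypothetical.

```python
from itertools import combinations

def blocks_from_difference_set(d, v):
    """Translate the base block d by every shift s modulo v."""
    return [frozenset((x + s) % v for x in d) for s in range(v)]

def is_symmetric_design(blocks, v, block_size):
    """True if there are v blocks of the given size and every pair of
    points occurs together in exactly one block (lambda = 1)."""
    if len(blocks) != v or any(len(b) != block_size for b in blocks):
        return False
    return all(sum(1 for b in blocks if p in b and q in b) == 1
               for p, q in combinations(range(v), 2))
```

For k = 2 the base block `{0, 1, 3}` modulo 7 yields a (7, 3, 1) design, and for k = 3 the base block `{0, 1, 3, 9}` modulo 13 yields a (13, 4, 1) design. In both, each point appears in k + 1 blocks.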

19.

A new scheme for the deterministic simulation of PRAMs in VLSI (total citations: 3; self-citations: 0; citations by others: 3)

F. Luccio, A. Pietracaprina, G. Pucci 《Algorithmica》1990,5(1):529-544

A deterministic scheme for the simulation of (n, m)-PRAM computation is devised. Each PRAM step is simulated on a bounded degree network consisting of a mesh-of-trees (MT) of side n. The memory is subdivided into n modules, each local to a PRAM processor. The roots of the MT contain these processors and the memory modules, while the other O(n²) nodes have the mere capabilities of packet switchers and one-bit comparators. The simulation algorithm makes crucial use of pipelining on the MT, and attains a time complexity of O(log² n / log log n). The best previous time bound was O(log² n) on a different interconnection network with n processors. While the previous simulation schemes use an intermediate MPC model, which is in turn simulated on a bounded degree network, our method performs the simulation directly with a simple algorithm. This work has been supported in part by Ministero della Pubblica Istruzione of Italy under a research grant.

20.

Periodically regular chordal rings

B. Parhami, Ding-Ming Kwai 《IEEE Transactions on Parallel and Distributed Systems》1999,10(6):658-672

Chordal rings have been proposed in the past as networks that combine the simple routing framework of rings with the lower diameter, wider bisection, and higher resilience of other architectures. Virtually all proposed chordal ring networks are node-symmetric, i.e., all nodes have the same in/out degree and interconnection pattern. Unfortunately, such regular chordal rings are not scalable. In this paper, periodically regular chordal (PRC) ring networks are proposed as a compromise for combining low node degree with small diameter. By varying the PRC ring parameters, one can obtain architectures with significantly different characteristics (e.g., from linear to logarithmic diameter), while maintaining an elegant framework for computation and communication. In particular, a very simple and efficient routing algorithm works for the entire spectrum of PRC rings thus obtained. This flexibility has important implications for key system attributes such as architectural scalability, software portability, and fault tolerance. Our discussion is centered on unidirectional PRC rings with in/out-degree of 2. We explore the basic structure, topological properties, optimization of parameters, VLSI layout, and scalability of such networks, develop packet and wormhole routing algorithms for them, and briefly compare them to competing fixed-degree architectures such as symmetric chordal rings, meshes, tori, and cube-connected cycles.