1.

Optimal broadcast in all-port wormhole-routed hypercubes

Ching-Tien Ho Ming-Yang Kao 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(2):200-204

We give an optimal algorithm that broadcasts on an n-dimensional hypercube in O(n/log₂(n+1)) routing steps with wormhole, e-cube routing and all-port communication. Previously, the best algorithm, by P.K. McKinley and C. Trefftz (1993), required ⌈n/2⌉ routing steps. We also give routing algorithms that achieve tight time bounds for n ⩽ 7.

2.

Near-optimal broadcast in all-port wormhole-routed hypercubes using error-correcting codes

Ko H. Latifi S. Srimani P.K. 《Parallel and Distributed Systems, IEEE Transactions on》2000,11(3):247-260

A new broadcasting method is presented for hypercubes with wormhole routing mechanism. The communication model assumed allows an n-dimensional hypercube to have at most n concurrent I/O communications along its ports. It further assumes a distance insensitivity of (n+1) with no intermediate reception capability for the nodes along the communication path. The approach is based on determination of the set of nodes (called stations) in the hypercube such that for any node in the network there is a station at distance of at most 1. Once stations are identified, parallel disjoint paths are formed from the source to all stations. The broadcasting is accomplished first by sending the message to all stations, which will in turn inform the rest of the nodes of the message. To establish node-disjoint paths between the source node and all stations, we introduce a new routing strategy. We prove that multicasting can be done in one routing step as long as the number of destination nodes is at most n in an n-dimensional hypercube. The number of broadcasting steps using our routing is equal to or smaller than that obtained in an earlier work; this number is optimal for all hypercube dimensions n ⩽ 12, except for n = 10.
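The station condition (every node is a station or adjacent to one) can be checked directly. The sketch below is an illustration, not the paper's construction: it assumes n = 2^k − 1, where the codewords of a Hamming code form such a set, and the helper names are hypothetical.

```python
def neighbors(node, n):
    """Nodes at Hamming distance 1 from `node` in an n-dimensional hypercube."""
    return [node ^ (1 << d) for d in range(n)]


def is_station_set(stations, n):
    """True if every node is a station or has a station at distance 1."""
    covered = set(stations)
    for s in stations:
        covered.update(neighbors(s, n))
    return len(covered) == 2 ** n


def hamming_stations(n):
    """Hamming-code codewords as stations; assumes n = 2**k - 1."""
    stations = []
    for node in range(2 ** n):
        syndrome = 0
        for d in range(n):
            if (node >> d) & 1:
                syndrome ^= d + 1  # XOR of the 1-based positions of set bits
        if syndrome == 0:
            stations.append(node)
    return stations


if __name__ == "__main__":
    print(hamming_stations(3))                        # stations in a 3-cube
    print(is_station_set(hamming_stations(7), 7))     # coverage check in a 7-cube
```

For dimensions that are not of the form 2^k − 1 this particular set is not guaranteed to cover every node, so a different station choice would be needed.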

3.

Optimal multicast communication in wormhole-routed torus networks

Robinson D.F. McKinley P.K. Cheng B.H.C. 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(10):1029-1042

This paper presents efficient algorithms that implement one-to-many, or multicast, communication in wormhole-routed torus networks. By exploiting the properties of the switching technology and the use of virtual channels, a minimum-time multicast algorithm is presented for n-dimensional torus networks that use deterministic, dimension-ordered routing of unicast messages. The algorithm can deliver a multicast message to m-1 destinations in ⌈log₂ m⌉ message-passing steps, while avoiding contention among the constituent unicast messages. Performance results of a simulation study on torus networks with up to 4096 nodes are also given.
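The ⌈log₂ m⌉ step count follows from recursive halving: in each step every node that holds the message hands responsibility for half of its remaining list to one new node. The sketch below shows only that schedule, under the assumption of one send per node per step; it does not model the torus, the node ordering, or the virtual channels that the paper uses to avoid contention, and `multicast_schedule` is a hypothetical name.

```python
def multicast_schedule(nodes):
    """nodes[0] is the source. Returns a list of steps, each a list of
    (sender, receiver) unicasts that may proceed in the same step."""
    steps = []
    segments = [list(nodes)]  # the first node of each segment holds the message
    while any(len(seg) > 1 for seg in segments):
        step, next_segments = [], []
        for seg in segments:
            if len(seg) == 1:
                next_segments.append(seg)
                continue
            mid = (len(seg) + 1) // 2
            step.append((seg[0], seg[mid]))  # head informs the upper half's head
            next_segments.append(seg[:mid])
            next_segments.append(seg[mid:])
        steps.append(step)
        segments = next_segments
    return steps


if __name__ == "__main__":
    for number, step in enumerate(multicast_schedule(list(range(6))), start=1):
        print(number, step)
```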

4.

Communication modeling of multicast in all-port wormhole-routed NoCs

Mahmoud Moadeli Wim Vanderbauwhede 《Journal of Systems and Software》2010,83(8):1327-1336

Multicast is one of the most frequently used collective communication operations in multi-core SoC platforms. The bus, as the traditional interconnect architecture for SoC development, has been highly efficient in delivering multicast messages. Since the bus is non-scalable, it cannot address the bandwidth requirements of large SoCs. Networks-on-chip (NoCs) emerged as a scalable alternative to address the increasing communication demands of such systems. However, due to their hop-to-hop communication, NoCs may not be able to deliver multicast operations as efficiently as buses do. Adopting multi-port routers has been one approach to improving the performance of multicast operations in interconnection networks. This paper presents a novel analytical model to compute the communication latency of the multicast operation in wormhole-routed interconnection networks employing an asynchronous multi-port router scheme. The model is applied to the Quarc NoC and its validity is verified by comparing the model predictions against the results obtained from a discrete-event simulator developed using OMNeT++.

5.

Task migration in all-port wormhole-routed 2D mesh multicomputers

Nen-Chung Wang Tzung-Shi Chen 《Information Sciences》2006,176(22):3409-3425

In a mesh multicomputer, submeshes are allocated to perform jobs according to processor allocation schemes, with each task assigned to occupy processors of one submesh with an appropriate size. To assign regions for incoming tasks, task compaction is needed to produce a large contiguous free region. The overhead of task compaction relies mainly on designing an efficient task migration scheme. This paper investigates task migration schemes in 2D wormhole-routed mesh multicomputers with an all-port communication model. Two constraints are given between two submeshes for task migration. Two task migration schemes that follow one of the constraints in 2D mesh multicomputers are then developed. In addition, the proposed schemes are proven to be deadlock-free and congestion-free. Finally, performance analysis is adopted to compare the proposed task migration schemes.

6.

Multicast communication in wormhole-routed 2D torus networks with hamiltonian cycle model

Neng-Chung Wang Yi-Ping Hung 《Journal of Systems Architecture》2009,55(1):70-78

In this paper, we propose an efficient multipath multicast routing algorithm in wormhole-routed 2D torus networks. We first introduce a Hamiltonian cycle model for exploiting the feature of torus networks. Based on this model, we find a Hamiltonian cycle in torus networks. Then, an efficient multipath multicast routing algorithm with Hamiltonian cycle model (multipath-HCM) is presented. The proposed multipath multicast routing algorithm utilizes communication channels more uniformly in order to reduce the path length of the routing messages, making the multicasting more efficient. Simulation results show that the multicast latency of the proposed multipath-HCM routing algorithm is superior to that of fixed and dual-path routing algorithms.
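As a concrete picture of a Hamiltonian cycle in a 2D torus, the sketch below builds one simple cycle: it snakes through columns 1 to cols−1 row by row and returns along column 0, using a wraparound link where needed. This is one possible construction chosen for illustration, not necessarily the cycle used in the paper, and the function names are hypothetical.

```python
def torus_hamiltonian_cycle(rows, cols):
    """Visit every node of a rows x cols torus once; the last node is
    adjacent to the first. Assumes rows >= 2 and cols >= 2."""
    cycle = [(0, 0)]
    for r in range(rows):
        # Snake through columns 1..cols-1, alternating direction per row.
        columns = range(1, cols) if r % 2 == 0 else range(cols - 1, 0, -1)
        cycle.extend((r, c) for c in columns)
    # Return to the start along column 0, from the last row upward.
    cycle.extend((r, 0) for r in range(rows - 1, 0, -1))
    return cycle


def torus_adjacent(a, b, rows, cols):
    """True if a and b are joined by a torus link (including wraparound)."""
    dr = (a[0] - b[0]) % rows
    dc = (a[1] - b[1]) % cols
    return (dc == 0 and dr in (1, rows - 1)) or (dr == 0 and dc in (1, cols - 1))


if __name__ == "__main__":
    print(torus_hamiltonian_cycle(3, 4))
```

Nodes can then be labeled by their position along the cycle, which is the kind of ordering that cycle-based multicast routing schemes build on.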

7.

All-to-all personalized communication in a wormhole-routed torus

Yu-Chee Tseng Gupta S.K.S. 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(5):498-505

All-to-all personalized communication, or complete exchange, is at the heart of numerous applications in parallel computing. It is one of the most dense communication patterns. In this paper, we consider this problem in a torus of any dimension with the wormhole-routing capability. We propose complete exchange algorithms that use optimal numbers of phases (if each side of the tori is a multiple of eight) or asymptotically optimal numbers of phases (otherwise). Interestingly, in order to achieve this, we make only weak assumptions: that a node is capable of sending and receiving at most one message at a time, and that the network is capable of supporting the dimension-ordered (or e-cube) minimum routing.

8.

An extended dominating node approach to broadcast and global combine in multiport wormhole-routed mesh networks

Yih-Jia Tsai McKinley P.K. 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(1):41-58

A new approach to the design of collective communication operations in wormhole-routed mesh networks is described. The approach extends the concept of dominating sets in graph theory by accounting for the relative distance-insensitivity of the wormhole switching strategy and by taking advantage of a multiport communication architecture, which allows each node to simultaneously transmit messages on different outgoing channels. Collective communication operations are defined in terms of sets of extended dominating nodes (EDNs). The nodes in a set of EDNs can deliver (receive) messages to (from) a different, larger set of nodes in a single message-passing step under dimension-ordered wormhole routing and without channel contention among messages. The EDN model can be applied to different collective operations in 2D and 3D mesh networks. The authors focus on EDN-based broadcast and global combine operations. Performance evaluation results are presented that confirm the advantage of this approach over other methods.

9.

A general theory for deadlock avoidance in wormhole-routed networks

Fleury E. Fraigniaud P. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(7):626-638

Most machines of the last generation of distributed memory parallel computers possess specific routers which are used to exchange messages between nonneighboring nodes in the network. Among the several technologies, wormhole routing is usually preferred because it allows low channel-setup time and reduces the dependency between latency and internode distance. However, wormhole routing is very susceptible to deadlock because messages are allowed to hold many resources while requesting others. Therefore, designing deadlock-free routing algorithms using few hardware facilities is a major problem for wormhole-routed networks. In this paper, we describe a general theoretical framework for the study of deadlock-free routing functions. We give a general definition of what can be a routing function. This definition captures many specific definitions of the literature (e.g., vertex dependent, input-dependent, source-dependent, path-dependent etc.). Using our definition, we give a necessary and sufficient condition which characterizes deadlock-free routing functions. Our theory embraces, at a high level, most of the theories related to deadlock avoidance in wormhole-routed networks previously derived in the literature. In particular, it applies not only to one-to-one routing, but also to one-to-many routing. The latter paradigm is used to solve the multicast problem with the path-based or tree-based facility.
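One of the earlier, more specific conditions that a general theory of this kind has to cover is the classical check that the channel dependency graph has no cycle. The sketch below shows only that classical check, not the paper's necessary and sufficient condition; the dictionary encoding of dependencies and the name `has_cycle` are assumptions made for illustration.

```python
def has_cycle(dependencies):
    """dependencies maps a channel to the channels a message holding it
    may request next. Returns True if the dependency graph has a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def visit(channel):
        color[channel] = GRAY
        for nxt in dependencies.get(channel, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:  # back edge: a cyclic wait is possible
                return True
            if state == WHITE and visit(nxt):
                return True
        color[channel] = BLACK
        return False

    return any(color.get(ch, WHITE) == WHITE and visit(ch) for ch in dependencies)


if __name__ == "__main__":
    # Unidirectional 4-node ring: each channel depends on the next one.
    ring = {0: [1], 1: [2], 2: [3], 3: [0]}
    # The same channels with the wraparound dependency removed.
    broken_ring = {0: [1], 1: [2], 2: [3], 3: []}
    print(has_cycle(ring), has_cycle(broken_ring))
```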

10.

Barrier synchronization on wormhole-routed networks

Yuzhong Sun Cheung P.Y.S. 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(6):583-597

In this paper, we propose an efficient barrier synchronization scheme on networks with arbitrary topologies. We first present a distributed method for building a barrier routing tree. The barrier messages can be delivered adaptively according to the hierarchy of the established barrier tree to avoid congestion and faulty nodes in the network. We then propose a new technique, called the bandwidth-preempting technique, for a blocked barrier message to preempt a channel occupied by a data message so that the latency of a barrier message can be controlled without much effect on overall system performance. We also propose an analytical performance model and present simulation results for the performance evaluation of the proposed scheme. Performance evaluations show that the proposed scheme outperforms the existing algorithms for barrier synchronization.

11.

An efficient broadcast scheduling algorithm for TDMA ad-hoc networks

《Computers & Operations Research》2002,29(13):1793-1806

In this paper, we propose an efficient algorithm to find a collision-free time slot schedule in a time division multiple access frame. In order to minimize the system delay, the optimal schedule is defined as the one that has the minimum frame length and provides the maximum slot utilization. The proposed algorithm is based on the sequential vertex coloring algorithm. Numerical examples and comparisons with the algorithm in previous research have shown that the proposed algorithm can find near-optimal solutions with respect to the system delay. Scope and purpose: An ad-hoc network was introduced in order to apply packet switching communication to a shared radio channel. Using a radio channel as the broadcast medium to interconnect users, an ad-hoc network provides flexible data communication services among a large number of geographically distributed, possibly mobile, radio units. In an ad-hoc network, since all users share a single channel by a multiple access protocol, unconstrained transmission may lead to the time overlap of two or more packet receptions, called collision, resulting in damaged, useless packets at the destination. Collided packets increase the system delay because they must be retransmitted. Therefore, the transmission for each station must be scheduled to avoid any collision; that is, collision-free transmission should be guaranteed. The time division multiple access (TDMA) technology can be used to schedule collision-free transmission. In this paper, we propose an efficient algorithm to find a collision-free time slot schedule in a TDMA ad-hoc network.
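Sequential vertex coloring maps naturally onto slot assignment: stations are nodes, two stations conflict if they are within two hops of each other, and a color is a time slot. The sketch below is a generic greedy coloring under those assumptions. The abstract does not give the paper's vertex ordering, so the largest-conflict-count-first order used here is an assumption, and the function names are hypothetical.

```python
def two_hop_conflicts(adjacency):
    """Stations within two hops of each other must not share a slot.
    `adjacency` is assumed to be a symmetric dict: node -> list of neighbors."""
    conflicts = {u: set() for u in adjacency}
    for u, nbrs in adjacency.items():
        for v in nbrs:
            conflicts[u].add(v)
            conflicts[u].update(w for w in adjacency[v] if w != u)
    return conflicts


def greedy_slots(adjacency):
    """Assign each station the lowest slot not used by a conflicting station."""
    conflicts = two_hop_conflicts(adjacency)
    # Assumed ordering: stations with the most conflicts are colored first.
    order = sorted(adjacency, key=lambda u: len(conflicts[u]), reverse=True)
    slot = {}
    for u in order:
        used = {slot[v] for v in conflicts[u] if v in slot}
        s = 0
        while s in used:
            s += 1
        slot[u] = s
    return slot


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # four stations in a line
    slots = greedy_slots(path)
    print(slots, "frame length:", max(slots.values()) + 1)
```

The frame length is the number of distinct slots used; a greedy coloring gives a valid schedule but not necessarily the shortest one.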

12.

An analytical model of broadcast in QoS-aware wormhole-routed NoCs

Mahmoud Moadeli Wim Vanderbauwhede 《Journal of Systems and Software》2011,84(1):12-20

Networks-on-Chip (NoCs) emerged to address the technological and design issues related to the development of large systems-on-chip (SoCs). Due to the diversity of applications' performance requirements, most NoC architectures offer support for quality of service (QoS). Also, to utilize the available bandwidth efficiently, they might implement mechanisms for delivering collective communication operations. This paper presents an analytical model to predict the average latency of wormhole-routed prioritized broadcast communication in NoCs. The model assumes that the network uses an all-port router scheme and offers differentiated services-based QoS. To verify the analysis, the model predictions are compared against the results obtained from a discrete-event simulator developed using OMNeT++.

13.

Unicast-based multicast communication in wormhole-routed networks

McKinley P.K. Xu H. Esfahanian A.-H. Ni L.M. 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(12):1252-1265

Multicast communication, in which the same message is delivered from a source node to an arbitrary number of destination nodes, is being increasingly demanded in parallel computing. System-supported multicast services can potentially offer improved performance, increased functionality, and simplified programming, and may in turn be used to support various higher-level operations for data movement and global process control. This paper presents efficient algorithms to implement multicast communication in wormhole-routed direct networks, in the absence of hardware multicast support, by exploiting the properties of the switching technology. Minimum-time multicast algorithms are presented for n-dimensional meshes and hypercubes that use deterministic, dimension-ordered routing of unicast messages. Both algorithms can deliver a multicast message to m-1 destinations in ⌈log₂ m⌉ message-passing steps, while avoiding contention among the constituent unicast messages. Performance results of implementations on a 64-node nCUBE-2 hypercube and a 168-node Symult 2010 2-D mesh are given.
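Deterministic, dimension-ordered routing on a hypercube (often called e-cube routing) can be sketched in a few lines: the differing address bits are corrected one dimension at a time in a fixed order. The lowest-to-highest order and the name `ecube_path` are assumptions for illustration; some systems resolve dimensions in the opposite order.

```python
def ecube_path(src, dst, n):
    """Nodes visited when routing src -> dst in an n-dimensional hypercube,
    correcting differing address bits from dimension 0 upward."""
    path = [src]
    current = src
    for d in range(n):
        if ((current ^ dst) >> d) & 1:
            current ^= 1 << d  # traverse the link in dimension d
            path.append(current)
    return path


if __name__ == "__main__":
    print([format(node, "03b") for node in ecube_path(0b110, 0b011, 3)])
```

Because the order of dimensions is fixed, the path between any two nodes is unique, which is what lets a unicast-based multicast algorithm reason about which messages can share a step without contending for a channel.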

14.

A trip-based multicasting model in wormhole-routed networks with virtual channels

Yu-Chee Tseng Panda D.K. Ten-Hwang Lai 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(2):138-150

This paper focuses on efficient multicasting in wormhole-routed networks. A trip-based model is proposed to support adaptive, distributed, and deadlock-free multiple multicast on any network with arbitrary topology using at most two virtual channels per physical channel. This model significantly generalizes the path-based model proposed earlier, which works only for Hamiltonian networks and is not applicable to networks whose topology has become arbitrary as a result of system faults. Fundamentals of the trip-based model, including the necessary and sufficient condition to be deadlock-free, and the use of an appropriate number of virtual channels to avoid deadlock, are investigated. The potential of this model is illustrated by applying it to hypercubes with faulty nodes. Simulation results indicate that the proposed model can implement multiple multicast on faulty hypercubes with negligible performance degradation.

15.

Pseudo-cycle-based multicast routing in wormhole-routed networks

Song Jianping Hou Zifeng Xu Ming 《Journal of Computer Science and Technology》2003,18(6):0-0

This paper addresses the problem of fault-tolerant multicast routing in wormhole-routed multicomputers. A new pseudo-cycle-based routing method is presented for constructing deadlock-free multicast routing algorithms. With at most two virtual channels this technique can be applied to any connected networks with arbitrary topologies. Simulation results show that this technique results in negligible performance degradation even in the presence of a large number of faulty nodes.

16.

The message flow model for routing in wormhole-routed networks

Lin X. McKinley P.K. Ni L.M. 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(7):755-760

In this paper, we introduce a new approach to deadlock-free routing in wormhole-routed networks called the message flow model. This method may be used to develop deterministic, partially-adaptive, and fully-adaptive routing algorithms for wormhole-routed networks with arbitrary topologies. We first establish the necessary and sufficient condition for deadlock-free routing, based on the analysis of the message flow on each channel. We then use the model to develop new adaptive routing algorithms for 2D meshes.

17.

Optimal software multicast in wormhole-routed multistage networks

Hong Xu Yadong Gui Ni L.M. 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(6):597-607

Multistage interconnection networks are a popular class of interconnection architecture for constructing scalable parallel computers (SPCs). The focus of this paper is on the multistage network system which supports wormhole-routed turnaround routing. Existing machines characterized by such a system model include the IBM SP-1 and SP-2, TMC CM-5, and Meiko CS-2. Efficient collective communication among processor nodes is critical to the performance of SPCs. A system-level multicast service, in which the same message is delivered from a source node to an arbitrary number of destination nodes, is fundamental in supporting collective communication primitives including the application-level broadcast, reduction, and barrier synchronization. This paper addresses how to efficiently implement multicast services in wormhole-routed multistage networks, in the absence of hardware multicast support, by exploiting the properties of the turnaround switching technology. An optimal multicast algorithm is proposed. The results of implementations on a 64-node SP-1 show that the proposed algorithm significantly outperforms the application-level broadcast primitives provided by currently existing collective communication libraries, including the public domain MPI.

18.

Efficient path-based multicast in wormhole-routed mesh networks

《Journal of Systems Architecture》2000,46(10):919-930

The capability of multidestination wormhole allows a message to be propagated along any valid path in a wormhole-routed network conforming to the underlying base routing scheme. The multicast on the path-based routing model is highly dependent on the spatial locality of destinations participating in multicasting. In this paper, we propose two proximity grouping schemes for efficient multicast in wormhole-routed mesh networks with multidestination capability by exploiting the spatial locality of the destination set. The first grouping scheme, graph-based proximity grouping, groups destinations with locality together to construct several disjoint sub-meshes. This is achieved by modeling the proximity grouping problem as a graph partitioning problem. The second one, pattern-based proximity grouping, uses pattern classification schemes to achieve the goal of the proximity grouping. Simulation results show the routing performance gains over the traditional Hamiltonian-path routing scheme.

19.

A hybrid genetic algorithm for the minimum energy broadcast problem in wireless ad hoc networks

Alok Singh Wilson Naik Bhukya 《Applied Soft Computing》2011,11(1):667-674

Given a wireless ad hoc network with a specified source node that has to broadcast messages to all other nodes in the network, the minimum energy broadcast (MEB) problem seeks a broadcast scheme for this network with minimum energy consumption. The MEB problem is NP-Hard. This paper describes a hybrid approach to the MEB problem combining a genetic algorithm with a local search heuristic. We have compared our hybrid approach against the best heuristic approaches known for this problem. Our approach outperformed all these approaches and emerged as the best.

20.

An adaptive global reduction algorithm for wormhole-routed 2D meshes

《Parallel Computing》1997,23(13):1909-1936

This paper presents a global reduction algorithm for wormhole-routed 2D meshes. Well-known reduction algorithms that are optimized for short vectors have complexity O(M log N), where N = n × n is the number of nodes, and M the vector length. Algorithms suitable for long vectors have complexity O(√N + M). Previously known asymptotically optimal algorithms with complexity O(log N + M) incur inherent network contention among constituent messages. The proposed algorithm adapts to the given vector length, resulting in complexities O(M log N) for short vectors, O(log N + M) for medium-sized vectors, and O(√N + M) for sufficiently long vectors. The O(√N + M) version is preferred to the O(log N + M) version for long vectors, due to its small coefficient associated with M, the dominating factor for such vectors. The algorithm is contention-free in a synchronous environment. Under asynchronous execution models, depth contention (contention among message-passing steps) may occur. However, simulation studies show that the effect of depth contention on the actual performance is negligible.