Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Three-dimensional packaging technologies are critical for enabling ultra-compact, massively parallel processors (MPPs) for embedded applications. Through-wafer optical interconnect has been proposed as a useful technology for building ultra-compact MPPs since it provides a simplified mechanism for interconnecting stacked multichip substrates. This paper presents the offset cube, a new network topology designed to exploit the packaging benefits of through-wafer optical interconnect in ultra-compact MPP systems. We validate the offset cube's topological efficiency by developing deadlock-free adaptive routing protocols with modest virtual channel requirements (only two virtual channels per link needed for full adaptivity). A preliminary analysis of router complexity suggests these protocols can be efficiently implemented in hardware. We also present a 3D mesh embedding for the offset cube. Network simulations show the offset cube performs comparably to a bidirectional 3D mesh of equal size under uniform, hot-spot, and trace-driven traffic loads. While the offset cube is not proposed as a general replacement for the mesh topology, it leverages the benefits of through-wafer optical interconnect more effectively than a mesh by completely eliminating chip-to-chip wires for data signals. Hence, the offset cube is an effective topology for interconnecting ultra-compact MCM-level MPP systems.

2.
In this paper, we address the global problem of designing reliable wavelength division multiplexing (WDM) networks, including traffic grooming. This global problem consists of finding the number of optical fibers between each pair of optical nodes, finding the configuration of each node with respect to transponders, finding the virtual topology (i.e., the set of lightpaths), routing the lightpaths, grooming the traffic (i.e., grouping the connections and routing them over the lightpaths) and, finally, assigning wavelengths to the lightpaths. Instead of partitioning the problem into subproblems and solving them successively, we propose a mathematical programming model that addresses it as a whole. Numerical results are presented and analyzed.

3.
The cylindrical banyan network is a variation of the classical banyan network in two ways: (1) each node is a processor with a switch, and (2) every pair of nodes at the two ends is merged. We present a routing algorithm for the cylindrical banyan network, and show it is optimal in terms of the path length. We also show that for the cylindrical banyan network containing L levels with 2^L processors at each level (i.e., containing L·2^L processors in total), the worst case path length between any two nodes is 1.5L. We discuss a generalization of the cylindrical banyan network and an optimal routing algorithm for it. Our routing algorithms are distributed, and so can be executed locally at each processor.
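The size and worst-case figures quoted in this abstract can be tabulated with a short Python sketch. The function names are hypothetical, the formulas are taken directly from the abstract (L levels, 2^L processors per level, worst-case path length 1.5L), and how the 1.5L bound is rounded for odd L is not stated there, so the value is returned unrounded.

```python
def cylindrical_banyan_size(levels: int) -> int:
    """Total processors: L levels with 2**L processors at each level."""
    if levels < 1:
        raise ValueError("levels must be >= 1")
    return levels * 2 ** levels


def worst_case_path_length(levels: int) -> float:
    """Worst-case path length 1.5 * L as quoted in the abstract (unrounded)."""
    return 1.5 * levels


if __name__ == "__main__":
    for L in (2, 4, 8):
        print(L, cylindrical_banyan_size(L), worst_case_path_length(L))
```

The point of the table is that the processor count grows exponentially in L while the quoted worst-case path length grows only linearly.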

4.
In a partitioned optical passive stars (POPS) network, n=dg processors are divided into g groups of d processors each, and such a POPS network is denoted by POPS(d,g). There is an optical passive star (OPS) coupler between every pair of groups. Hence, a POPS(d,g) requires g^2 couplers. It is likely that, in a practical system, the number of couplers will be less than the number of processors, i.e., d > √n > g, and the number of groups will be smaller than the number of processors in a group. Hence, it is important to design fast algorithms for basic operations on such POPS networks with large group size. We present fast algorithms for data sum, prefix sum, and permutation routing on a POPS(d,g) such that d > √n > g. Our data sum and prefix sum algorithms improve upon the best known algorithms for these problems designed by Sahni (2000). Permutation routing can be solved on a POPS network by simulating a hypercube sorting algorithm. Our algorithm for permutation routing is more efficient compared to this simulated hypercube sorting algorithm.
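As a rough illustration of the parameters in this abstract, the following Python sketch computes the processor and coupler counts of a POPS(d,g) network and checks the large-group condition d > √n > g. The function name and the dictionary keys are hypothetical; only the relations n = dg and "g^2 couplers" come from the abstract.

```python
import math


def pops_parameters(d: int, g: int) -> dict:
    """Basic counts for a POPS(d, g) network: g groups of d processors."""
    if d < 1 or g < 1:
        raise ValueError("d and g must be positive")
    n = d * g  # total number of processors
    return {
        "processors": n,
        "couplers": g * g,  # one coupler per ordered pair of groups
        "large_group_regime": d > math.sqrt(n) > g,
    }


if __name__ == "__main__":
    print(pops_parameters(8, 2))
    print(pops_parameters(2, 8))
```

Because n = dg, the condition d > √n > g holds exactly when d > g, which is why the second call reports that the network is outside the regime the paper targets.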

5.
Array Processors with Pipelined Buses (APPBs) are hybrid optical-electronic multiprocessor architectures in which message-pipelined optical buses are used for interprocessor communications. Presented in this paper is a structural variation of the basic APPB which utilizes optical switches to provide the capability of switching messages between buses without their being relayed by intermediate processors. Such switching capability eliminates the optical-electronic-optical signal conversion due to message relays and offers improved communication efficiency. We discuss routing issues, evaluate bandwidth improvement, and present efficient communication operations, including matrix transpose, binary tree routing, and perfect shuffle, which take advantage of the switching capability.

6.
This paper presents a general virtual ring method to design and analyze small-world structured P2P networks on base topologies embedded in ID spaces with a distance metric. Its basic idea is to abstract a virtual ring from the base topology according to the distance metric, then build small-world long links in the virtual ring and map the links back onto the real network to construct the small-world routing tables for achieving logarithmic greedy routing efficiency. Four properties are proposed to characterize the base topologies that can be turned into small-world networks by the virtual ring method. The virtual ring method is applied to the base topologies of d-torus with Manhattan distance, high dimensional d-torus base topologies, and other base topologies including the unbalanced d-torus and the ring topology with tree distance. Theoretical analysis and simulation experiments demonstrate the efficiency and the resilience of the proposed overlays.

7.
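Deadlock-free routing algorithms on star graphs   Total citations: 4, self-citations: 0, citations by others: 4  
The star graph has many desirable topological properties and is a model of interconnection network for parallel computing that may replace the traditional hypercube. In this paper, the authors study deadlock-free routing algorithms on the star graph, addressing the deadlocks that can arise in such a highly regular network. First, using the properties of matching bases in the star graph, a definition of the regular mapping from Sn(B) to Sk is given. Then two deadlock-free restriction conditions on the star graph are proposed. Finally, a routing algorithm satisfying the deadlock-free restriction conditions is proved. The authors also propose a minimal deadlock-free restriction condition for routing algorithms on the star graph.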

8.
The selection of a topology is essential to the performance of interconnection networks, so designing a new, cost-effective topology is very significant. The 2D mesh is one of the most popular topologies. However, the diameter and average distance of a 2D mesh are large enough to greatly influence the performance of the network. This paper presents a novel topology called TM, which combines the advantages of both a 2D torus and a 2D mesh. For an n×n network, the total number of links in a TM is the same as that in a mesh, while the diameter of a TM is extremely close to that of a torus. Besides, the average distance of a TM lies between that of a torus and that of a mesh. To prevent deadlocks in TMs, a virtual network partitioning scheme is adopted in the TM network. Moreover, both deterministic and fully-adaptive routing techniques for TMs are proposed in this paper. Compared to the mesh, the TM network reduces average distance and diameter, which contributes to the performance enhancement. Simulation results are presented to show the effectiveness of the TM network, and of the new routing schemes proposed for it, by comparison with the mesh network. Compared to the torus, which requires at least 3 virtual channels to support fully-adaptive routing, the TM network can support fully-adaptive routing with only 2 virtual channels. The experimental results show that in most cases TM performs worse than the torus, while in some cases its performance is comparable to or even better than that of the torus.
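The abstract does not describe how a TM is wired, so the following Python sketch only computes the two baselines it is compared against: the link count and diameter of an n×n mesh and an n×n torus, found by breadth-first search. All function names are hypothetical, and the sketch makes no claim about the TM topology itself.

```python
from collections import deque


def grid_graph(n: int, wrap: bool) -> dict:
    """Adjacency sets of an n x n mesh (wrap=False) or torus (wrap=True)."""
    adj = {}
    for r in range(n):
        for c in range(n):
            nbrs = set()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if wrap:
                    rr, cc = rr % n, cc % n
                if 0 <= rr < n and 0 <= cc < n and (rr, cc) != (r, c):
                    nbrs.add((rr, cc))
            adj[(r, c)] = nbrs
    return adj


def link_count(adj: dict) -> int:
    """Each undirected link appears in two adjacency sets."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def diameter(adj: dict) -> int:
    """Longest shortest-path length, via BFS from every node."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


if __name__ == "__main__":
    for name, wrap in (("mesh", False), ("torus", True)):
        g = grid_graph(4, wrap)
        print(name, "links:", link_count(g), "diameter:", diameter(g))
```

For n = 4 the mesh has fewer links but a larger diameter than the torus, which is the gap the TM topology is said to narrow while keeping the mesh's link count.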

9.
We have recently introduced the Extended OTIS-n-Cube to overcome some limitations of the well-known OTIS-n-Cube, such as its degree and diameter. This paper presents an extensive study of the topological properties of the new interconnection network. Inspired by the attractive features of the new network, such as regular degree, small diameter, and semantic structure, we present a theoretical study on some topological properties of the Extended OTIS-n-Cube, including routing paths and embedded cycles. Furthermore, the paper presents a performance evaluation of the topology by comparing it with the OTIS-n-Cube. Results prove the superiority of the new topology, especially in minimizing routing distances.

10.
The paper presents novel embeddings of various classical topologies into the OPAM multicomputer. OPAM consists of a large number of processors that are connected by a two-level, crossbar-based interconnection network. The network combines a large, optical circuit-switched crossbar (reconfigurable network) with many small, packet-switching crossbars. The necessary embedding is very different from classical approaches. The goal in our case is to minimize routing decisions, so that communication requests can be satisfied by passing through two small crossbars. We show how to map parallel programs to this architecture using graph contraction notations. The family of parallel programs that we consider consists of multiple processes and communication links that are represented by connected, regular graphs such as rings, trees, two-dimensional grids, cube-connected cycles and hypercubes. In each case we show how to partition the vertex set of the program's graph into subsets, and how to assign each subset a cluster of processors in order to realize the topology of the given problem. In some of the cases we also prove that our partition and assignment algorithms are optimal.

11.
Energy consumption in Wireless Sensor Networks (WSNs) is of paramount importance, as demonstrated by the large number of algorithms, techniques, and protocols that have been developed to save energy and thereby extend the lifetime of the network. In the context of WSN routing and dissemination, the Connected Dominating Set (CDS) principle has emerged as the most popular method for energy-efficient topology control (TC). In a CDS-based topology control technique, a virtual backbone is formed, which allows communication between any arbitrary pair of nodes in the network. In this paper, we present a CDS-based topology control algorithm, A1, which forms an energy-efficient virtual backbone. In our simulations, we compare the performance of A1 with three prominent CDS-based algorithms, namely energy-efficient CDS (EECDS), CDS Rule K and A3. The results demonstrate that A1 performs better in terms of message overhead and other selected metrics. Moreover, A1 not only achieves better connectivity under topology maintenance but also provides better sensing coverage when compared with the other algorithms.

12.
We have designed and implemented a light‐weight process (thread) library called ‘Lesser Bear’ for SMP computers. Lesser Bear has high portability and thread‐level parallelism. Creating UNIX processes as virtual processors and a memory‐mapped file as a huge shared‐memory space enables Lesser Bear to execute threads in parallel. Lesser Bear requires exclusive operation between peer virtual processors, and treats a shared‐memory space as a critical section for synchronization of threads. Therefore, thread functions of the previous Lesser Bear are serialized. In this paper, we present a scheduling mechanism to execute thread functions in parallel. In the design of the proposed mechanism, we divide the entire shared‐memory space into partial spaces for virtual processors, and prepare two queues (Protect Queue and Waiver Queue) for each partial space. We adopt an algorithm in which lock operations are not necessary for enqueueing. This algorithm allows us to propose a scheduling mechanism that can reduce the scheduling overhead. The mechanism is applied to Lesser Bear and evaluated by experimental results. Copyright © 2001 John Wiley & Sons, Ltd.

13.
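This paper describes the design and implementation of a high-speed optical interconnection network adapter for cluster systems with a dual-ring network topology. The adapter is implemented with FPGA technology: its bus interface uses the high-speed, high-bandwidth DDR DIMM bus, the network transmission medium is optical fiber, and the low-level routing protocol is implemented in hardware logic inside the FPGA. Together these choices ensure a network with high bandwidth, low latency, and high reliability.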

14.
With traditional event-list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared-memory experiments using the Chandy-Misra distributed simulation algorithm to simulate networks of queues is presented. Parameters of the study include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.

15.
This paper presents fault-tolerant protocols for fast packet switch networks with convergence routing. The objective is to provide fast reconfiguration and continuous host-to-host communication after a link or a node (switch) failure. Convergence routing can be viewed as a variant of deflection routing, which combines, in a dynamic fashion, the on-line routing decision with the traffic load inside the network. Unlike other deflection techniques, convergence routing operates with a global sense of direction and guarantees that packets will reach or converge to their destinations. Global sense of direction is achieved by embedding virtual rings to obtain a linear ordering of the nodes. We consider virtual ring embeddings over (i) a single spanning tree, and (ii) two edge-disjoint spanning trees. Thus, the fault-tolerant solution is based on spanning trees and designed for a switch-based (i.e., arbitrary topology) architecture called MetaNet. In this work, the original MetaNet's convergence routing scheme has been modified so that the packet header need not be recomputed after a failure and/or a reconfiguration. This is achieved by having, at the network interface, a translator that maps the unique destination address to a virtual address. It is argued that virtual rings embedded over two edge-disjoint spanning trees increase the fault tolerance for both node and link faults and provide continuous host-to-host communication.

16.
The work performed by a parallel algorithm is the product of its running time and the number of processors it requires. This paper presents work-efficient (or cost-optimal) routing algorithms to determine the switch settings for realizing permutations on rearrangeable symmetrical networks such as Benes and the reduced Ω_N Ω_N^(-1). These networks have 2n-1 stages with N=2^n inputs/outputs, each stage consisting of N/2 crossbar switches of size (2×2). Previously known parallel routing algorithms for a rearrangeable network with N inputs determine the states of all switches recursively in O(n) iterations using N processors. Each iteration determines the switch settings of at most two stages of the network and requires at least O(n) time on a computer of N processors, regardless of the type of its interconnection network. Hence, the work of any previously known parallel routing algorithm equals at least O(Nn^2) for setting up all the switches of a rearrangeable network. The new routing algorithms run on a computer of p processors, 1⩽p⩽N/n, and perform work O(Nn). Moreover, because the range of p is large, the new routing algorithms do not have to be changed in case some processors become faulty.
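The arithmetic behind the work comparison in this abstract can be sketched in Python. The function names are hypothetical, and the two work estimates drop all constant factors, so they illustrate only the orders of growth O(Nn^2) and O(Nn) quoted above, not measured costs.

```python
def benes_dimensions(n: int) -> tuple:
    """Return (N, stages, switches_per_stage, total_switches) for N = 2**n."""
    if n < 1:
        raise ValueError("n must be >= 1")
    N = 2 ** n
    stages = 2 * n - 1          # 2n-1 stages
    per_stage = N // 2          # N/2 switches of size 2x2 in each stage
    return N, stages, per_stage, stages * per_stage


def work(time_steps: int, processors: int) -> int:
    """Work of a parallel algorithm: running time times processor count."""
    return time_steps * processors


def prior_work_estimate(n: int) -> int:
    """O(n) iterations of O(n) time each on N processors, constants dropped."""
    return work(n * n, 2 ** n)


def new_work_estimate(n: int) -> int:
    """O(N n) work of the new algorithms, constants dropped."""
    return (2 ** n) * n


if __name__ == "__main__":
    for n in (3, 10, 20):
        print(n, benes_dimensions(n), prior_work_estimate(n), new_work_estimate(n))
```

The ratio between the two estimates is n, so the saving grows with the logarithm of the network size N.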

17.
The hypercube multiprocessor is a popular architecture in parallel computing environments. Recently, computer security and privacy issues have gained significance. This paper considers the security issues of a network of processors connected over a hypercube topology. We demonstrate that a covert channel can be established by exploiting the underlying message communication mechanism of the hypercube, even when the access control denies such communication. This can occur because node-to-node communication in a hypercube may require multiple hops and two or more disjoint message communications may actually be transmitted along common links. Congestion (and the resulting delay) in such shared links can provide the basis for a covert channel. We introduce security considerations for a multiprocessor by focussing on the covert channel issue in hypercube message communication. A security model for the hypercube routing function is presented. Based on noninterference, we develop sufficient conditions for the routing mechanism to be free of covert channels. Two secure hypercube message routing approaches are proposed for the store-and-forward communication strategy. The first approach (Virtual Channel) achieves security by fixed bandwidth partitioning of links, for which the price is paid in delay performance. The second approach (Bypass) prioritizes lower security class messages, for which delay of higher class messages is sacrificed. The performance (i.e., cost of security) of these two approaches is shown using simulation. Finally, a time-out feature is introduced to the Bypass approach, which disallows potential starvation of higher class messages at the expense of a limited-bandwidth covert channel. The maximum covert channel bandwidth (in terms of the time-out parameter) is analyzed.

18.
Mapping and load-balancing iterative computations   Total citations: 1, self-citations: 0, citations by others: 1  
We consider the mapping of iterative algorithms onto heterogeneous clusters. The application data is partitioned over the processors, which are arranged along a virtual ring. At each iteration, independent calculations are carried out in parallel, and some communications take place between consecutive processors in the ring. The aim is to determine how to slice the application data into chunks, and to assign these chunks to the processors, so that the total execution time is minimized. One major difficulty is to embed a processor ring into a network that typically is not fully connected, so that some communication links have to be shared by several processor pairs. We establish a complexity result that assesses the difficulty of this problem, and we design a practical heuristic that provides efficient mapping, routing, link-sharing, and data distribution schemes.

19.
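An effective framework for constructing adaptive routing algorithms by overlaying virtual networks   Total citations: 2, self-citations: 0, citations by others: 2  
In massively parallel processor systems, the routing algorithm plays an important role in the communication performance of the interconnection network and in system performance.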

20.
This paper proposes a new survivable algorithm named sub-path protection based on auxiliary virtual topology (SPAVT) to tolerate single-link failures in WDM optical networks. First, according to the protection-switching time constraint, SPAVT searches off-line for multiple pairs of primary and backup paths for each node pair in the network, and then maps these paths to the virtual topology. When a connection request arrives, SPAVT only needs to run Dijkstra’s algorithm once to search for a virtual route in the virtual topology, where the route may consist of multiple pairs of sub-paths, to meet the protection-switching time constraint. Then, according to the shared resources policy, SPAVT chooses an optimal pair of sub-paths. Simulation results show that SPAVT has smaller blocking probability and lower time complexity than conventional algorithms.
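The on-line step described in this abstract is a single run of Dijkstra's algorithm over the virtual topology. The Python sketch below shows a generic Dijkstra shortest-route search only. It is not the SPAVT algorithm: the function name, the adjacency-dictionary format, and the example graph with its link weights are all assumptions made for illustration, and the off-line path computation and shared-resource selection are not modeled.

```python
import heapq


def dijkstra_route(graph: dict, source, target):
    """Return (cost, path) for the cheapest route, or None if unreachable.

    graph maps each node to a dict {neighbor: non-negative link weight}.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry for an already settled node
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


if __name__ == "__main__":
    # Hypothetical virtual topology; weights stand in for virtual-link costs.
    virtual_topology = {
        "A": {"B": 1, "C": 4},
        "B": {"C": 1, "D": 5},
        "C": {"D": 1},
        "D": {},
    }
    print(dijkstra_route(virtual_topology, "A", "D"))
```

In a setting like the one the abstract describes, each virtual link would presumably stand for a precomputed pair of primary and backup sub-paths, so one search over the virtual topology selects a whole chain of protected segments.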
