20 similar documents found (search time: 0 ms)
1.
2.
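张国珍 (Zhang Guozhen) 《计算机工程与应用》 (Computer Engineering and Applications) 2015,51(20):1-4
A unidirectional k-ary n-cube is a k-ary n-cube interconnection network topology with unidirectional edges. When the network contains a large number of vertices, a unidirectional k-ary n-cube places lower demands on communication hardware complexity than the traditional bidirectional k-ary n-cube. This paper proposes an orientation of the k-ary n-cube such that the resulting unidirectional k-ary n-cube UQ_n^k has several good properties. It is proved that UQ_n^k is regular, maximally arc-connected, has an iterative structure, and has a small diameter. In addition, a simple polynomial-time routing algorithm is proposed.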
3.
《Journal of Systems Architecture》2004,50(11):697-705
This paper derives a number of results related to the topological properties of OTIS k-ary n-cube interconnection networks. The basic topological metrics of size, degree, shortest distance, and diameter are obtained. Then results related to embedding in OTIS k-ary n-cubes of OTIS k-ary (n−1)-cubes, cycles, meshes, cubes, and spanning trees are derived. The OTIS k-ary n-cube is shown to be Hamiltonian. Minimal one-to-one routing and optimal broadcasting algorithms are proposed. The OTIS k-ary n-cube is shown to be maximally fault-tolerant. These results are derived based on known properties of k-ary n-cube networks and general properties of OTIS networks.
4.
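张国珍 (Zhang Guozhen) 《计算机工程与应用》 (Computer Engineering and Applications) 2013,(22):3-6
The k-ary n-cube Q_n^k is one of the most commonly used interconnection network topologies in the design of large-scale multiprocessor systems. For 1 ≤ m ≤ n-1, let F be a fault set in Q_n^k consisting of a nonempty vertex set V_F and a nonempty edge set E_F, such that Q_n^k - F contains no Q_{n-m}^k and neither the set of Q_{n-m}^k subcubes destroyed by V_F nor the set destroyed by E_F contains the other. Let f*(n,m) be the minimum cardinality of a fault set F needed to destroy all subcubes Q_{n-m}^k in Q_n^k. It is proved that, for odd k ≥ 3, f_k(n,1) is k+1, f_k(n,n-1) is k^(n-1) - 1 + n, and the upper and lower bounds of f*(n,m) are C(n-1, m-1) k^m + C(n-2, m-1) k^(m-1) and k^m, respectively. An example shows that the upper bound C(n-1, m-1) k^m + C(n-2, m-1) k^(m-1) is optimal.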
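As a rough illustration of the bounds stated in this abstract, the following Python sketch evaluates them for small parameters. It assumes the flattened binomial terms read as C(n-1, m-1) and C(n-2, m-1); the function names are hypothetical and not taken from the paper.

```python
from math import comb


def upper_bound(k: int, n: int, m: int) -> int:
    """Upper bound as read from the abstract (an assumed reconstruction):
    C(n-1, m-1) * k^m + C(n-2, m-1) * k^(m-1)."""
    return comb(n - 1, m - 1) * k**m + comb(n - 2, m - 1) * k ** (m - 1)


def lower_bound(k: int, m: int) -> int:
    """Lower bound stated in the abstract: k^m."""
    return k**m


if __name__ == "__main__":
    k, n = 3, 4
    for m in range(1, n):
        print(m, lower_bound(k, m), upper_bound(k, n, m))
```

Under this reading, m = 1 gives an upper bound of k + 1, which matches the exact value the abstract reports for that case.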
5.
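The functionality of a parallel computer system depends heavily on the performance of its interconnection network. To precisely measure the fault tolerance of parallel computer systems whose underlying topology is the k-ary n-cube, this paper studies the reliability of k-ary (n-1)-cube subnetworks in the k-ary n-cube under the node fault model. For odd k ≥ 3, it analyzes the mean time to failure for different numbers of k-ary (n-1)-cube subnetworks to remain fault-free, under both a fixed partitioning scheme and a flexible partitioning scheme, and derives formulas for this subnetwork reliability measure. The results show that, when a parallel computer system built on a k-ary n-cube with odd k assigns subnetworks to run user tasks, the flexible partitioning scheme offers better fault tolerance than the fixed partitioning scheme under the node fault model.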
6.
7.
《Parallel and Distributed Systems, IEEE Transactions on》1999,10(9):904-921
A spate of deadlock avoidance-based and deadlock recovery-based routing algorithms have been proposed in recent years without full understanding of the likelihood and characteristics of actual deadlocks in interconnection networks. This work models the interrelationships between routing freedom, message blocking, correlated resource dependencies, and deadlock formation. It is empirically shown that increasing routing freedom, as achieved by allowing unrestricted routing over multiple physical and virtual channels, reduces the probability of deadlocks and the likelihood of other types of correlated message blocking that can degrade performance. Moreover, when true fully adaptive routing is used in k-ary n-cube networks with two or more virtual channels (wormhole or virtual cut-through switched), it is empirically shown that deadlocks are virtually eliminated in networks with n⩾2. These results indicate that deadlocks are very infrequent when the network and routing algorithm inherently provide sufficient routing freedom, thus increasing the viability of deadlock recovery routing strategies.
8.
In a pipelined-channel interconnection network, multiple bits may be simultaneously in flight on a single wire. This allows the cycle time of the network to be independent of the wire lengths, significantly affecting the network design trade-offs. This paper investigates the design and performance of pipelined-channel k-ary n-cube networks, with particular emphasis on the choice of dimensionality and radix. Networks are investigated under the constant link width, constant node size and constant bisection constraints. We find that the optimal dimensionality of pipelined-channel networks is higher than that of nonpipelined-channel networks, with the difference being greater under looser wiring constraints. Their radix should remain roughly constant as network size is grown, decreasing slightly for some unidirectional tori and increasing slightly for some bidirectional meshes. Pipelined-channel networks are shown to provide lower latency and higher bandwidth than their nonpipelined-channel counterparts, especially for high-dimensional networks. The paper also investigates the effects of switching overhead and message lengths, indicating where results agree with and differ from previous results obtained for nonpipelined-channel networks.
9.
《Performance Evaluation》2006,63(4-5):423-440
Several analytical models of fully adaptive routing in wormhole-routed networks have recently been reported in the literature. All these models, however, have been discussed for routing algorithms with deadlock avoidance. Recent studies have revealed that deadlocks are quite rare in the network, especially when enough routing freedom is provided. Thus, the hardware resources, e.g. virtual channels, dedicated for deadlock avoidance are not utilised most of the time. This consideration has motivated researchers to introduce fully adaptive routing algorithms with deadlock-recovery. This paper proposes a new analytical model to predict message latency in k-ary n-cubes with compressionless routing, a fully adaptive algorithm that uses deadlock-recovery. The proposed model uses results from queueing systems with impatient customers to capture the effects of the timeout mechanism used in this routing algorithm to deal with message deadlock. The validity of the model is demonstrated by comparing results predicted by the analytical model against those obtained through simulation experiments.
10.
Dhabaleswar K. Panda 《Future Generation Computer Systems》1995,11(6):585-602
This paper presents a new approach to implement fast barrier synchronization in wormhole k-ary n-cubes. The novelty lies in using multidestination messages instead of the traditional single destination messages. Two different multidestination worm types, gather and broadcasting, are introduced to implement the report and wake-up phases of barrier synchronization, respectively. Algorithms for complete and arbitrary set barrier synchronization are presented using these new worms. It is shown that complete barrier synchronization in a k-ary n-cube system with e-cube routing can be implemented with 2n communication start-ups as compared to 2n log_2 k start-ups needed with unicast-based message passing. This leads to an asymptotic improvement by a factor of log_2 k. Simulation results for different system and architectural parameters indicate that the new framework can reduce barrier synchronization cost considerably compared to the unicast-based scheme. For arbitrary set barrier, an interesting trend is observed where the synchronization cost keeps on reducing beyond a certain number of participating nodes. The framework demonstrates potential for supporting fast barrier synchronization in large wormhole-routed systems.
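To make the start-up counts in this abstract concrete, the following Python sketch compares 2n start-ups (multidestination worms) with 2n log2 k start-ups (unicast-based). It takes the abstract's expressions literally, without any rounding the paper may apply when k is not a power of two, and the function names are hypothetical.

```python
import math


def multidestination_startups(n: int) -> int:
    """Start-ups for a complete barrier with multidestination worms: 2n."""
    return 2 * n


def unicast_startups(k: int, n: int) -> float:
    """Start-ups with unicast-based message passing: 2n * log2(k)."""
    return 2 * n * math.log2(k)


if __name__ == "__main__":
    k, n = 16, 3  # a 16-ary 3-cube, chosen only as an example
    multi = multidestination_startups(n)
    uni = unicast_startups(k, n)
    print(multi, uni, uni / multi)
```

For the 16-ary 3-cube used here, the ratio of the two counts equals log2 16 = 4, the improvement factor named in the abstract.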
11.
《Journal of Parallel and Distributed Computing》2004,64(2):183-190
Incomplete or pruned k-ary n-cube, n⩾3, is derived as follows. All links of dimension n−1 are left in place and links of the remaining n−1 dimensions are removed, except for one, which is chosen periodically from the remaining dimensions along the intact dimension n−1. This leads to a node degree of 4 instead of the original 2n and results in regular networks that are Cayley graphs, provided that n−1 divides k. For , the preceding restriction is not problematic, as it only requires that k be even (a multiple of 4). In other cases, changes to the basis network to be pruned, or to the pruning algorithm, can mitigate the problem. Incomplete k-ary n-cube maintains a number of desirable topological properties of its unpruned counterpart despite having fewer links. It is maximally connected, has diameter and fault diameter very close to those of k-ary n-cube, and an average internode distance that is only slightly greater. Hence, the cost/performance tradeoffs offered by our pruning scheme can in fact lead to useful, and practically realizable, parallel architectures. We study pruned k-ary n-cubes in general and offer some additional results for the special case n=3.
12.
Thottethodi M. Lebeck A.R. Mukherjee S.S. 《Parallel and Distributed Systems, IEEE Transactions on》2004,15(3):257-272
Network performance in tightly-coupled multiprocessors typically degrades rapidly beyond network saturation. Consequently, designers must keep a network below its saturation point by reducing the load on the network. Congestion control via source throttling, a common technique to reduce the network load, prevents new packets from entering the network in the presence of congestion. Unfortunately, prior schemes to implement source throttling either lack vital global information about the network to make the correct decision (whether to throttle or not) or depend on specific network parameters or communication patterns. This paper presents a global-knowledge-based, self-tuned, congestion control technique that prevents saturation at high loads across different communication patterns for k-ary n-cube networks. Our design is composed of two key components. First, we use global information about a network to obtain a timely estimate of network congestion. We compare this estimate to a threshold value to determine when to throttle packet injection. The second component is a self-tuning mechanism that automatically determines appropriate threshold values based on throughput feedback. A combination of these two techniques provides high performance under heavy load, does not penalize performance under light load, and gracefully adapts to changes in communication patterns.
13.
This work presents Immucube, a scalable and efficient mechanism to improve dependability of interconnection networks for parallel and distributed computers. Immucube achieves better flexibility and scalability than any other previous fault-tolerant mechanism in k-ary n-cubes. The proposal inherits from Immunet several advantages over other previous fault-tolerant routing algorithms: 1) allowing any temporal and spatial fault combination, 2) permitting automatic and application-transparent reconfiguration after any fault, and 3) requiring a negligible overhead in the absence of faults. Immucube introduces new important features, such as: 4) providing graceful performance degradation, even in very large interconnection networks, 5) tolerating transparent resource utilization after transitory faults or partial repair of faulty resources, 6) being able to deal with intermittent faults, and 7) being able to dynamically recover the original network performance when all the failed components have been repaired.
14.
The evaluation of advanced routing features must be based on both costs and benefits. To date, adaptive routers have generally been evaluated on the basis of the achieved network throughput (channel utilization), ignoring the effects of implementation complexity. In this paper, we describe a parameterized cost model for router performance, characterized by two numbers: router delay and flow control time. Grounding the cost model in a 0.8 micron gate array technology, we use it to compare a number of proposed routing algorithms. From these design studies, several insights into the implementation complexity of adaptive routers are clear. First, header update and selection is expensive in adaptive routers, suggesting that absolute addressing should be reconsidered. Second, virtual channels are expensive in terms of latency and cycle time, so decisions to include them to support adaptivity or even virtual lanes should not be taken lightly. Third, requirements of larger crossbars and more complex arbitration cause some increase in the complexity of adaptive routers, but the rate of increase is small. Last, the complexity of adaptive routers significantly increases their setup delay and flow control cycle times, implying that claims of performance advantages in channel utilization and low load latency must be carefully balanced against losses in achievable implementation speed.
15.
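This paper analyzes the architectural characteristics of k-ary n-cube interconnection networks in terms of three aspects: topology, routers, and channels. It establishes a network performance model and discusses how the network architecture, application programs, and runtime environment affect network performance, as well as measures for improving it.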
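For readers unfamiliar with the topology itself, the following Python sketch builds the standard bidirectional k-ary n-cube (an n-dimensional torus with k nodes per dimension) and measures its degree and diameter by breadth-first search. It is a generic illustration and not the performance model of this paper; the function names are hypothetical.

```python
from collections import deque


def neighbors(node: tuple, k: int) -> set:
    """Nodes that differ from `node` by +1 or -1 (mod k) in exactly one coordinate."""
    result = set()
    for i, x in enumerate(node):
        for step in (1, -1):
            result.add(node[:i] + ((x + step) % k,) + node[i + 1:])
    return result


def diameter(k: int, n: int) -> int:
    """Largest BFS distance from the all-zero node.

    The torus is vertex-transitive, so a single source is enough."""
    start = (0,) * n
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in neighbors(u, k):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


if __name__ == "__main__":
    print(len(neighbors((0, 0, 0), 3)))  # node degree of the 3-ary 3-cube
    print(diameter(3, 3), diameter(4, 2))
```

For k ≥ 3 each node has 2n neighbors, and the diameter is n times floor(k/2); the sketch can be used to check both for small k and n.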
16.
Panda D.K. Singal S. Kesavan R. 《Parallel and Distributed Systems, IEEE Transactions on》1999,10(1):76-96
This paper proposes multidestination message passing on wormhole k-ary n-cube networks using a new base-routing-conformed-path (BRCP) model. This model allows both unicast (single-destination) and multidestination messages to co-exist in a given network without leading to deadlock. The model is illustrated with several common routing schemes (deterministic, as well as adaptive), and the associated deadlock-freedom properties are analyzed. Using this model, a set of new algorithms for popular collective communication operations, broadcast and multicast, are proposed and evaluated. It is shown that the proposed algorithms can considerably reduce the latency of these operations compared to the Umesh (unicast-based multicast) and the Hamiltonian path-based schemes. A very interesting result that is presented shows that a multicast can be implemented with reduced or near-constant latency as the number of processors participating in the multicast increases beyond a certain number. It is also shown that the BRCP model can take advantage of adaptivity in routing schemes to further reduce the latency of these operations. The multidestination mechanism and the BRCP model establish a new foundation to provide fast and scalable collective communication support on wormhole-routed systems.
17.
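To measure the connectivity of parallel and distributed systems whose underlying topology is the 3-ary n-cube network, this paper constructs a 2-extra cut and shows that, for n ≥ 2, the 2-extra connectivity of the 3-ary n-cube network is 6n-7. It is proved that, in a parallel and distributed computer system built on the 3-ary n-cube network, when at most 6n-8 nodes fail and every connected component still contains at least 3 healthy nodes, any two nodes of the system are still connected by a fault-free communication path.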
18.
《Information Processing Letters》2014,114(9):486-491
19.
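Diagnosability is a key metric for evaluating the reliability of multiprocessor systems. The t/k-diagnosis strategy greatly increases a system's diagnosability by allowing up to k fault-free processors to be misdiagnosed as faulty. Compared with t-diagnosability and t1/t1-diagnosability, t/k-diagnosability better reflects the fault patterns of real systems. The 3-ary n-cube is a widely used network topology with good properties, serving as the underlying network in many distributed multiprocessors. Based on several lemmas and a sufficient condition for a system to be t/k-diagnosable, it is shown that for n ≥ 3 and 0 ≤ k ≤ n, the 3-ary n-cube is t_{k,n}/k-diagnosable, where t_{k,n} = 2(k+1)n - (k+1)(k+2). This result shows that, with a suitable choice of k, the t/k-diagnosability t_{k,n} of the 3-ary n-cube is far greater than its t-diagnosability 2n and its t1/t1-diagnosability 4n-3.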
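The formula in this abstract can be tabulated directly. The following Python sketch evaluates t_{k,n} = 2(k+1)n - (k+1)(k+2) over the stated range 0 ≤ k ≤ n and prints it next to the two reference values 2n and 4n-3 that the abstract quotes. The function name is hypothetical, and the example value of n is chosen only for illustration.

```python
def t_over_k_diagnosability(k: int, n: int) -> int:
    """t_{k,n} = 2(k+1)n - (k+1)(k+2), stated in the abstract for n >= 3, 0 <= k <= n."""
    if n < 3 or not 0 <= k <= n:
        raise ValueError("formula is stated only for n >= 3 and 0 <= k <= n")
    return 2 * (k + 1) * n - (k + 1) * (k + 2)


if __name__ == "__main__":
    n = 8  # example dimension
    print("t-diagnosability 2n =", 2 * n)
    print("t1/t1-diagnosability 4n-3 =", 4 * n - 3)
    for k in range(n + 1):
        print(k, t_over_k_diagnosability(k, n))
```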