
1.

Scheduling Protocols for Switches with Large Envelopes

Matthew Andrews Lisa Zhang 《Journal of Scheduling》2004,7(3):171-186

Traditionally, switches make scheduling decisions on the granularity of a packet. However, this is becoming increasingly difficult since network bandwidth is growing rapidly whereas packet sizes remain largely unchanged. Therefore the service time of an individual packet is decreasing rapidly. In this paper we study switches that make scheduling decisions on the granularity of an envelope which can be much larger than a packet in size. For an output-queued switch with envelope size E, each output chooses one input every E time steps and transmits packets from this chosen input during the next E steps. For an input-queued switch with envelope size E, one matching from the inputs to the outputs is computed every E steps and only the input–output pairs that are defined by this matching are allowed to transmit packets during the next E steps. Traditional switches correspond to envelope size E = 1 and almost all previous scheduling work deals with this case exclusively. We first show how some stable protocols for scheduling networks of output-queued switches with E = 1 fail for arbitrary E when these protocols are generalized in the most straightforward manner. We then present an extremely simple protocol that does guarantee network stability for output-queued switches for any E ≥ 1. For input-queued switches we first present a max-weight matching protocol that is stable for a single switch with arbitrary E. We then present a more complex protocol that achieves stability for a network of input-queued switches for any E ≥ 1.
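
The envelope rule for an output-queued switch can be illustrated with a minimal sketch. It simulates a single output port that re-selects its input every E steps. The function name, the arrival format, and the longest-queue selection rule are assumptions made for illustration; the abstract does not say which selection rule the paper's protocols use.

```python
from collections import deque

def run_output_with_envelopes(arrivals, num_inputs, envelope, steps):
    """Simulate one output port that picks an input every `envelope` steps.

    arrivals maps a time step to the list of inputs receiving one packet then.
    Returns the list of (time, input) transmissions.
    """
    queues = [deque() for _ in range(num_inputs)]
    sent = []
    chosen = None
    for t in range(steps):
        for i in arrivals.get(t, []):
            queues[i].append(t)  # a packet is represented by its arrival time
        if t % envelope == 0:
            # Assumed rule: longest queue wins, ties go to the lowest index.
            chosen = max(range(num_inputs), key=lambda i: (len(queues[i]), -i))
        if queues[chosen]:
            queues[chosen].popleft()
            sent.append((t, chosen))
    return sent
```

With `envelope=2`, a packet that arrives at input 1 at step 1 has to wait until step 2, even though the output is idle at step 1. This is the kind of wasted capacity that makes E > 1 harder than E = 1.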

2.

Bandwidth guaranteed multicast scheduling for virtual output queued packet switches

Deng Pan Yuanyuan Yang 《Journal of Parallel and Distributed Computing》2009

Multicast enables efficient data transmission from one source to multiple destinations, and has been playing an important role in Internet multimedia applications. Although several multicast scheduling schemes for packet switches have been proposed in the literature, they usually aim to achieve only short multicast latency and high throughput without considering bandwidth guarantees. However, fair bandwidth allocation is critical for the quality of service (QoS) of the network, and is necessary to support multicast applications requiring guaranteed performance services, such as online audio and video streaming. This paper addresses the issue of bandwidth guaranteed multicast scheduling on virtual output queued (VOQ) switches. We propose the Credit based Multicast Fair scheduling (CMF) algorithm, which aims at achieving not only short multicast latency but also fair bandwidth allocation. CMF uses a credit based strategy to guarantee the reserved bandwidth of an input port on each output port of the switch. It keeps track of the difference between the reserved bandwidth and actually received bandwidth, and minimizes the difference to ensure fairness. Moreover, in order to fully utilize the multicast capability provided by the switch, CMF lets a multicast packet simultaneously send transmission requests to multiple output ports. In this way, a multicast packet has more chances to be delivered to multiple destination output ports in the same time slot and thus to achieve short multicast latency. Extensive simulations are conducted to evaluate the performance of CMF, and the results demonstrate that CMF achieves the two design goals: fair bandwidth allocation and short multicast latency.
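
The credit idea (track reserved minus received bandwidth and serve whoever is furthest behind) can be sketched for a single output port. This is a generic credit scheduler written for illustration, not the CMF algorithm itself; the function name, the per-slot credit update, and the tie-breaking rule are assumptions.

```python
def credit_schedule(reserved, requests_per_slot):
    """Serve one output port for several slots using per-input credits.

    reserved[i] is the fraction of the output's bandwidth reserved for input i.
    requests_per_slot is a list of sets: the inputs requesting in each slot.
    Returns the input served in each slot (None when nobody requests).
    """
    credits = [0.0] * len(reserved)
    served = []
    for requesting in requests_per_slot:
        # Every input earns its reserved share in every slot.
        for i, share in enumerate(reserved):
            credits[i] += share
        if not requesting:
            served.append(None)
            continue
        # Largest credit = largest gap between reserved and received service.
        winner = max(sorted(requesting), key=lambda i: credits[i])
        credits[winner] -= 1.0  # one slot of service has been received
        served.append(winner)
    return served
```

For reservations of 0.75 and 0.25 with both inputs always requesting, four slots are split three to one, matching the reserved shares.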

3.

Synchronous versus asynchronous operation of a packet switch with combined input and output queueing

Ilias Iliadis 《Performance Evaluation》1992,16(1-3):241-250

A single-stage non-blocking N × N packet switch with combined input and output queueing is considered. The limited queueing at the output ports partially resolves output port contention. Overflow at the output queues is prevented by employment of a backpressure mechanism and additional queueing at the input ports. This paper investigates the performance of the switch under two different modes of operation: asynchronous and synchronous or slotted. For the purpose of comparison a switch model is developed. Assuming Poisson packet arrivals, several performance measures are obtained analytically. These include the distribution of the delay through the switch, the input queue length distribution, packet losses at the inputs in the case of finite input queues, and the maximum switch throughput. The results obtained demonstrate a slight performance advantage of asynchronous over synchronous operation. However, the maximum switch throughput is the same for both modes of operation.

4.

Tiny Tera: a packet switch core

McKeown N. Izzard M. Mekkittikul A. Ellersick W. Horowitz M. 《Micro, IEEE》1997,17(1):26-33

Describes Tiny Tera: a small, high-bandwidth, single-stage switch. Tiny Tera has 32 ports switching fixed-size packets, each operating at over 10 Gbps (approximately the Sonet OC-192e rate, a telecom standard for system interconnects). The switch distinguishes four classes of traffic and includes efficient support for multicasting. We aim to demonstrate that it is possible to use currently available CMOS technology to build this compact switch with an aggregate bandwidth of approximately 1 terabit per second and a central hub no larger than a can of soda. Such a switch could serve as a core for an ATM switch or an Internet router. Tiny Tera is an input-buffered switch, which makes it the highest bandwidth switch possible given a particular CMOS and memory technology. The switch consists of three logical elements: ports, a central crossbar switch, and a central scheduler. It queues packets at a port on entry and optionally prior to exit. The scheduler, which has a map of each port's queue occupancy, determines the crossbar configuration every packet time slot. Input queueing, parallelism, and tight integration are the keys to such a high-bandwidth switch. Input queueing reduces the memory bandwidth requirements: when a switch queues packets at the input, the buffer memories need run no faster than the line rate. Thus, there is no need for the speedup required in output-queued switches.

5.

Multicast support in multi-chip centralized schedulers in Input Queued switches

Andrea Bianco Alessandra Scicchitano 《Computer Networks》2009,53(7):1040-1049

IQ switches store packets at input ports to avoid the memory speedup required by OQ switches. However, packet schedulers are needed to determine an I/O (input/output) interconnection pattern that avoids conflicts among packets at output ports. Today, centralized, single-chip scheduler implementations are largely dominant. In the near future, multi-chip scheduler implementations will be needed to reduce the hardware scheduler complexity in very large, high-speed switches. However, a multi-chip implementation introduces a non-negligible delay among the input and output selectors used to determine the I/O interconnection pattern at each time slot. This delay, mainly due to inter-chip latency, requires modifications to traditional scheduling algorithms, which normally rely on the hypothesis that information exchange among selectors can be performed with negligible delay. We propose a novel multicast scheduler, named IMRR, an extension of a previously proposed multicast scheduling algorithm named mRRM, which makes it suitable for a multi-chip implementation, and examine its performance by simulation.

6.

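Research on and Implementation of the iSLIP Scheduling Algorithm  Total citations: 4, self-citations: 0, citations by others: 4

Liu Huajun, Liu Bin 《小型微型计算机系统》2003,24(9):1593-1596

Techniques for designing the internal switching fabric of high-performance network switches and routers to raise system throughput are now fairly mature. Efficient queue scheduling algorithms that are easy to implement in hardware, however, remain an important topic worth studying. This paper first discusses how, in a switching system whose input buffers use FIFO queues, throughput is limited mainly by head-of-line (HOL) blocking. It then examines the basic principles of the iSLIP scheduling algorithm, its iterative arbitration steps, and its hardware implementation. To address the key problem of switching and forwarding decisions in hardware, it gives a switching architecture that uses virtual output queues in an input-queued switch, together with a hardware implementation scheme for a multi-priority scheduling algorithm. Finally, the performance of iSLIP is analyzed and compared, showing that the proposed implementation is simple to realize and performs well.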
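
One iSLIP arbitration iteration (request, grant, accept) can be sketched in software as follows. This is a minimal single-iteration model for illustration only, not the hardware design the paper describes; the function name and data layout are assumptions.

```python
def islip_iteration(requests, grant_ptr, accept_ptr):
    """Run one request-grant-accept iteration on an n x n switch.

    requests[i][j] is True when input i has a cell queued for output j.
    grant_ptr[j] and accept_ptr[i] are round-robin pointers, updated in place.
    Returns a dict mapping each matched input to its output.
    """
    n = len(requests)
    # Grant: each output picks the first requesting input at or after its pointer.
    grants = {}  # input -> list of outputs that granted to it
    for j in range(n):
        for k in range(n):
            i = (grant_ptr[j] + k) % n
            if requests[i][j]:
                grants.setdefault(i, []).append(j)
                break
    # Accept: each input picks the first granting output at or after its pointer.
    match = {}
    for i, outs in grants.items():
        j = min(outs, key=lambda o: (o - accept_ptr[i]) % n)
        match[i] = j
        # Pointers advance one past the matched port only for accepted grants.
        grant_ptr[j] = (i + 1) % n
        accept_ptr[i] = (j + 1) % n
    return match
```

Calling it twice on a fully loaded 2 x 2 switch shows the pointers desynchronizing: the first slot matches only one pair, the second slot matches both inputs.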

7.

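Distributed Input-Queued DRR Fair Packet Scheduling in the CICQ Switch Architecture

Wang Rong, Chen Yue 《计算机应用》2005,25(7):1488-1490,1493

Traditional crossbar-based input-queued switch architectures fall well short in providing good QoS. Compared with them, the CICQ (combined input and crosspoint buffered queuing) architecture not only delivers throughput close to that of output queueing under a variety of input traffic but also supports QoS well. Based on the CICQ architecture, this paper proposes a scheme for implementing a flow-based, distributed DRR fair packet scheduling algorithm under input queueing, and verifies the effectiveness of the scheme by simulation.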
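
For readers unfamiliar with DRR (deficit round robin), the basic per-port algorithm can be sketched as below. This is the textbook single-scheduler form with one shared quantum, shown only as background; it is not the distributed CICQ scheme proposed in the paper, and the function name and arguments are assumptions.

```python
from collections import deque

def drr(queues, quantum, rounds):
    """Deficit round robin over per-flow queues of packet sizes.

    queues is a list of deques holding packet sizes (modified in place).
    Returns the transmission order as (queue_index, packet_size) pairs.
    """
    deficit = [0] * len(queues)
    sent = []
    for _ in range(rounds):
        for q, packets in enumerate(queues):
            if not packets:
                deficit[q] = 0  # idle flows do not accumulate credit
                continue
            deficit[q] += quantum
            # Send head packets while the deficit counter covers them.
            while packets and packets[0] <= deficit[q]:
                size = packets.popleft()
                deficit[q] -= size
                sent.append((q, size))
            if not packets:
                deficit[q] = 0
    return sent
```

A flow with large packets skips a round when its deficit is too small and catches up in the next one, so long-run service is shared fairly regardless of packet size.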

8.

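A Multi-Step Scheduling Strategy for Input-Buffered Switches

Sun Zhigang, Lu Xicheng 《软件学报》2001,12(8):1170-1176

Input-buffered switches are used in a growing number of ATM switches and high-performance routers. For independent cell arrivals, combining VOQ (virtual output queueing) with weighted scheduling algorithms such as LQF (longest queue first) and OCF (oldest cell first) lets the switch reach 100% throughput. However, weighted scheduling algorithms such as LQF and OCF are too complex to implement in hardware. This paper proposes a multi-step scheduling strategy that makes hardware implementation of weighted scheduling algorithms feasible. Under this strategy, for independent cell arrivals, the LQF algorithm still achieves 100% switch throughput and has good ...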

9.

Max-Min Fair Scheduling in Input-Queued Switches  Total citations: 1, self-citations: 0, citations by others: 1

Hosaagrahara M. Sethu H. 《Parallel and Distributed Systems, IEEE Transactions on》2008,19(4):462-475

Fairness in traffic management can improve the isolation between traffic streams, offer a more predictable performance, eliminate transient bottlenecks, mitigate the effect of certain kinds of denial-of-service attacks, and serve as a critical component of a quality-of-service strategy to achieve certain guaranteed services such as delay bounds and minimum bandwidths. In this paper, we choose a popular notion of fairness called max-min fairness and provide a rigorous definition in the context of input-queued switches. We show that being fair at the output ports alone or at the input ports alone or even at both input and output ports does not actually achieve an overall max-min fair allocation of bandwidth in a switch. Instead, we propose a new algorithm that can be readily implemented in a distributed fashion at the input and output ports to determine the exact max-min fair rate allocations for the flows through the switch. In addition to proving the correctness of the algorithm, we propose a practical scheduling strategy based on our algorithm. We present simulation results, using both real traffic traces and synthetic traffic, to evaluate the fairness of a variety of popular scheduling algorithms for input-queued switches. The results show that our scheduling strategy achieves better fairness than other known algorithms for input-queued switches and, in addition, achieves throughput performance very close to that of the best schedulers.
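
What a max-min fair allocation looks like in a switch can be shown with the standard centralized progressive-filling procedure: repeatedly find the most constrained port and freeze its flows at an equal share. This sketch is not the distributed algorithm the paper proposes; the function name, the flow representation, and the assumption of equal port capacities are illustrative.

```python
from fractions import Fraction

def max_min_rates(flows, capacity=Fraction(1)):
    """Compute max-min fair rates for flows given as (input_port, output_port).

    Every input and output port is assumed to have the same capacity.
    Returns one rate per flow, in the order the flows were given.
    """
    rates = [None] * len(flows)
    remaining = {}
    for i, j in flows:
        remaining[("in", i)] = capacity
        remaining[("out", j)] = capacity
    while any(r is None for r in rates):
        # Equal share each port could still give to its unfrozen flows.
        shares = {}
        for port, cap in remaining.items():
            active = [f for f, (i, j) in enumerate(flows)
                      if rates[f] is None and port in (("in", i), ("out", j))]
            if active:
                shares[port] = (cap / len(active), active)
        # The port with the smallest share is the bottleneck: freeze its flows.
        share, active = min(shares.values(), key=lambda s: s[0])
        for f in active:
            i, j = flows[f]
            rates[f] = share
            remaining[("in", i)] -= share
            remaining[("out", j)] -= share
    return rates
```

With flows (0,0), (0,1), (1,1), (2,1), output 1 is the bottleneck, so its three flows get 1/3 each and flow (0,0) picks up the remaining 2/3 of input 0. Splitting input 0 evenly between its two flows would leave capacity unused, which illustrates why per-port fairness alone is not max-min fair overall.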

10.

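Flow-Control Mechanisms and Crosspoint Buffer Capacity Analysis for CICQ Switches

Wang Rong, Lin Yusong 《计算机工程》2006,32(7):240-242

Traditional crossbar-based input-queued switch architectures fall well short in providing good QoS. Compared with them, the CICQ (combined input and crosspoint buffered queuing) architecture not only delivers throughput close to that of output queueing under a variety of input traffic but also supports QoS well. This paper analyzes the flow-control mechanisms of the CICQ architecture, discusses the overhead and implementation of credit-based flow control, analyzes the crosspoint buffer capacity, and gives the minimum buffer capacity needed to sustain 100% throughput of the switching fabric under various memory write conditions.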

11.

HIPIQS: a high-performance switch architecture using input queuing

Sivaram R. Stunkel C.B. Panda D.K. 《Parallel and Distributed Systems, IEEE Transactions on》2002,13(3):275-289

Switch-based interconnects are used in a number of application domains, including parallel system interconnects, local area networks, and wide area networks. However, very few switches have been designed that are suitable for more than one of these application domains. Such a switch must offer both extremely low latency and very high throughput for a variety of different message sizes. While some architectures with output queuing have been shown to perform extremely well in terms of throughput, their performance can suffer when used in systems where a significant portion of the packets are extremely small. On the other hand, architectures with input queuing offer limited throughput or require fairly complex and centralized arbitration that increases latency. In this paper, we present a new input queue-based switch architecture called HIPIQS (HIgh-Performance Input-Queued Switch). It offers low latency for a range of message sizes and provides throughput comparable to that of output queuing approaches. Furthermore, it allows simple and distributed arbitration. HIPIQS uses a dynamically allocated multiqueue organization, pipelined access to multibank input buffers, and small cross-point buffers to deliver high performance. Our simulation results show that HIPIQS can deliver performance close to that of output queuing approaches over a range of message sizes, system sizes, and traffic. The switch architecture can therefore be used to build high performance switches that are useful for both parallel system interconnects and for building computer networks.

12.

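Research on Multicast Scheduling Algorithms for Buffered Crossbar Switch Architectures  Total citations: 1, self-citations: 0, citations by others: 1

Sun Shutao, He Simin, Zheng Yanfeng, Gao Wen 《计算机研究与发展》2006,43(6):1036-1043

Multicast scheduling in high-performance core switching equipment is receiving growing attention. Multicast scheduling schemes for crossbar architectures either perform poorly or are too complex to use in high-speed switching. This paper therefore proposes a multicast-oriented buffered crossbar architecture with multiple input queues. Multicast scheduling is decomposed into three subproblems that can run in a distributed, parallel fashion (cell assignment, input scheduling, and output scheduling), and corresponding scheduling algorithms are designed, which reduces algorithmic complexity. Experimental results show that both the crosspoint buffer capacity and the number of input queues strongly affect multicast performance. Under bursty traffic arrivals, compared with an architecture that has a single multicast input queue, the proposed architecture significantly improves system throughput, whether it uses the O(1)-complexity HA-RR-RR or more complex scheduling algorithms.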

13.

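A High-Performance Scheduling Algorithm for CICQ Switches with Small Crosspoint Buffers  Total citations: 5, self-citations: 1, citations by others: 5

Li Yong, Luo Junzhou, Wu Jun 《计算机研究与发展》2006,43(12):2033-2040

The CICQ (combined input crosspoint queued/queuing) architecture needs no internal speedup, and its input and output packet scheduling can run in a distributed, parallel manner; using the RR (round robin) algorithm with it offers distinct advantages in high-performance switch design. However, a CICQ switch using RR cannot reach 100% throughput under non-uniform traffic. The performance of the RR-RR algorithm under non-uniform traffic is determined by two key factors: the capacity of the central buffer, and the service loss caused when long queues at the inputs are not served in time. Based on theoretical analysis, a high-performance scheduling algorithm for small buffers is proposed. Simulation results show that, even with a buffer of one cell, the new algorithm reaches 100% throughput under both uniform and non-uniform traffic. The new algorithm has only O(1) complexity, retains the simplicity and effectiveness of RR-RR, and overcomes the instability of RR-RR under non-uniform traffic.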

14.

Port partitioning and dynamic queueing for IP forwarding

《Computers & Operations Research》2002,29(9):1157-1172

With the increase of internet protocol (IP) packets, the performance of routers has become an important issue in internetworking. In this paper we examine the matching algorithm in a gigabit router that has input queues with virtual output queueing. Dynamic queue scheduling is also proposed to reduce the packet delay and packet loss probability. Port partitioning is employed to reduce the computational burden of the scheduler in a switch which matches the input and output ports for fast packet switching. Each set of ports is divided into two groups such that the matching algorithm is implemented within each pair of groups in parallel. The matching is performed by exchanging the pair of groups at every time slot. Two algorithms, maximal weight matching by port partitioning (MPP) and modified maximal weight matching by port partitioning (MMPP), are presented. In dynamic queue scheduling, a popup decision rule for each delay critical packet is made to reduce both the delay of the delay critical packet and the loss probability of the loss critical packet. Computational results show that MMPP has the lowest delay and requires the least buffer size. The throughput is illustrated to be linear in the packet arrival rate, which can be achieved under a highly efficient matching algorithm. The dynamic queue scheduling is illustrated to be highly effective when the occupancy of the input buffer is relatively high.

Scope and purpose: To cope with the increasing internet traffic, it is necessary to improve the performance of routers. To accelerate the switching from input ports to output ports in the router, partitioning of ports and dynamic queueing are proposed. Input and output ports are partitioned into two groups A/B and a/b, respectively. The matching for the packet switching is performed between group pairs (A, a) and (B, b) in parallel at one time slot and (A, b) and (B, a) at the next time slot. Dynamic queueing is proposed at each input port to reduce the packet delay and packet loss probability by employing the popup decision rule and applying it to each delay critical packet. The partitioning of ports is illustrated to be highly effective in view of delay, required buffer size and throughput. The dynamic queueing also demonstrates good performance when the traffic volume is high.
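
The alternating group pairing can be made concrete with a small sketch that lists which input-output pairs each of the two parallel sub-schedulers may consider in a given slot. The abstract does not say how ports are assigned to groups, so splitting the port range into a lower and an upper half is an assumption, as is the function name.

```python
def allowed_pairs(num_ports, slot):
    """Return the two sets of (input, output) pairs matched in parallel in a slot.

    Assumed partition: group A/a is the lower half of the ports and group B/b
    the upper half. Even slots pair (A, a) and (B, b); odd slots pair (A, b)
    and (B, a).
    """
    half = num_ports // 2
    lower, upper = range(half), range(half, num_ports)
    if slot % 2 == 0:
        pairing = [(lower, lower), (upper, upper)]  # (A, a) and (B, b)
    else:
        pairing = [(lower, upper), (upper, lower)]  # (A, b) and (B, a)
    return [{(i, j) for i in ins for j in outs} for ins, outs in pairing]
```

Each sub-scheduler works on a quarter of the full request matrix, and every input-output pair becomes eligible once every two slots.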

15.

Design and performance of speculative flow control for high-radix datacenter interconnect switches

Cyriel Minkenberg Mitchell Gusat 《Journal of Parallel and Distributed Computing》2009

High-radix switches are desirable building blocks for large computer interconnection networks, because they are more suitable to convert chip I/O bandwidth into low latency and low cost than low-radix switches [J. Kim, W.J. Dally, B. Towles, A.K. Gupta, Microarchitecture of a high-radix router, in: Proc. ISCA 2005, Madison, WI, 2005]. Unfortunately, most existing switch architectures do not scale well to a large number of ports, for example, the complexity of the buffered crossbar architecture scales quadratically with the number of ports. Compounded with support for long round-trip times and many virtual channels, the overall buffer requirements limit the feasibility of such switches to modest port counts. Compromising on the buffer sizing leads to a drastic increase in latency and reduction in throughput, as long as traditional credit flow control is employed at the link level. We propose a novel link-level flow control protocol that enables high-performance scalable switches that are based on the increasingly popular buffered crossbar architecture, to scale to higher port counts without sacrificing performance. By combining credited and speculative transmission, this scheme achieves reliable delivery, low latency, and high throughput, even with crosspoint buffers that are significantly smaller than the round-trip time. The proposed scheme substantially reduces message latency and improves throughput of partially buffered crossbar switches loaded with synthetic uniform and non-uniform bursty traffic. Moreover, simulations replaying traces of several typical MPI applications demonstrate communication speedup factors of 2 to 10 times.

16.

A new shared-buffer packet switch in ATM networks

《Computer Communications》2001,24(3-4):445-451

To improve the overall cell loss rate and resolve the fairness problem in shared-buffer switches, a new output-queued shared-buffer switch is proposed in this paper. The proposed switch uses a novel output queue structure to improve cell loss behavior and resolve the fairness problem. The performance of the proposed output-queued shared-buffer switch is analyzed and compared. According to the analysis, the proposed shared-buffer switch shows superior performance and overcomes the fairness problem. In addition, the proposed buffer architecture is simple to implement in high-speed packet switches.

17.

Designing and implementing a fast crossbar scheduler  Total citations: 1, self-citations: 0, citations by others: 1

Gupta P. McKeown N. 《Micro, IEEE》1999,19(1):20-28

Crossbar switches frequently function as the internal switching fabric of high performance network switches and routers. However, for fairness and high utilization, a crossbar needs an intelligent, centralized scheduler. We describe the design and implementation of a scheduling algorithm for configuring crossbars in input queued switches that support virtual output queues and multiple priority levels of unicast and multicast traffic. We carried out this design for Stanford University's Tiny Tera prototype, a fast, label-swapping packet switch. Its scheduler, designed to configure a crossbar once every 51 ns, implements the ESLIP scheduling algorithm, which consists of multiple round-robin arbiters.

18.

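A Fast-Converging Scheduling Algorithm for Input-Queued Switches  Total citations: 1, self-citations: 0, citations by others: 1

Liu Donggang, Hou Zifeng 《计算机工程与应用》2002,38(1):150-153,190

As demand for network bandwidth grows, high-performance switches are becoming increasingly important. A switch has three parts: (1) input buffers that hold the cells arriving at each input port; (2) output buffers that hold the cells waiting to be sent at each output port; (3) a scheduling module that schedules input cells to the output ports they require. When several input ports request the same output port, the scheduling algorithm arbitrates and selects one input-output pair. In general, a large part of a switch's performance depends on the performance of this scheduling algorithm, yet the algorithm should not become the bottleneck of the switch. This paper discusses many algorithms in common use in recent years and, on that basis, proposes a new scheduling algorithm. Computer simulation results show that the algorithm is more efficient and converges faster.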

19.

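Maximum Weight Matching Algorithms for Virtual Output Queued Switch Architectures

E Dawei 《计算机工程与应用》2001,37(18):66-69

Head-of-line (HOL) blocking limits the throughput of switches that use FIFO input queues, whereas virtual output queueing (VOQ) eliminates HOL blocking completely. This paper presents a VOQ switch model, introduces the maximum-weight-matching algorithms LQF, OCF, and LPF and their performance, and describes the more practical parallel iterative algorithms i-LQF, i-OCF, and i-LPF. The conclusions are of practical value for building high-bandwidth switches.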
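
As a small illustration of maximum weight matching with LQF weights, the sketch below picks the input-to-output permutation with the largest total VOQ occupancy by brute force. Enumerating all permutations takes O(n!) time and is only meant to show the objective on tiny switches; it is not one of the practical iterative algorithms the paper describes, and the function name is an assumption.

```python
from itertools import permutations

def lqf_match(voq_len):
    """Return the matching that maximizes total queue length (LQF weights).

    voq_len[i][j] is the occupancy of the VOQ at input i for output j.
    The result p is a tuple where input i is connected to output p[i].
    """
    n = len(voq_len)
    return max(permutations(range(n)),
               key=lambda p: sum(voq_len[i][p[i]] for i in range(n)))
```

For `[[5, 4], [4, 0]]` the best matching is `(1, 0)` with weight 8. Serving the single longest queue first would connect input 0 to output 0 and reach only weight 5.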

20.

Achieving fair service with a hybrid scheduling scheme for CICQ switches

HU HongChao GUO YunFei YI Peng & LAN JuLong 《中国科学:信息科学(英文版)》2012,(3):689-700

Providing performance guarantees for arriving traffic flows has become an important measure for today’s routing and switching systems. However, none of the current scheduling algorithms built on CICQ (combined input and cross-point buffered) switches can provide flow level performance guarantees. To meet this requirement, the feasibility of implementing flow level scheduling is discussed thoroughly. Based on that discussion, a hybrid and stratified fair scheduling (HSFS) scheme, which is hierarchical and hybrid, is proposed for CICQ switches. With HSFS, each input port and output port can schedule variable length packets independently with a complexity of O(1). Theoretical analysis shows that HSFS can provide delay bound, service rate and fair performance guarantees without speedup. Finally, we implement HSFS in SPES (switch performance evaluation system) to verify the analytical results.