期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

孙书韬贺思敏郑燕峰高文《计算机研究与发展》2006,43(6):1036-1043

高性能核心交换设备多播调度受到越来越多的关注·交叉开关结构下的多播调度方案或者性能较差,或者过于复杂,难于应用在高速交换场合·为此,提出一种面向多播的多输入队列缓冲交叉开关体系结构·将多播调度分解为信元分派、输入调度、输出调度3个可分布式并行执行的子问题,并设计了相应的调度算法,降低了算法复杂性·实验结果表明,交叉点缓冲区容量与输入队列数量对多播性能都具有很大的影响·在突发流量到达下,与单多播输入队列的体系结构相比,无论是采用O(1)复杂度的HA-RR-RR还是复杂度更高的调度算法,均能显著提高系统吞吐性能· 相似文献

2.

RRR: recursive round robin scheduler

《Computer Networks》1999,31(18):1951-1966

Scheduling has been an interesting problem since its inception. In the context of real-time networks, a scheduling algorithm is concerned with dispatching streams of packets sharing the same bandwidth such that certain guaranteed performance for each stream like rate and delay bound is provided. This function has a wide range of applications in network elements such as host adaptors, routers and switches. This paper proposes and describes a new scheduling algorithm named as recursive round robin (RRR) scheduler. It is based on the concept of the construction of a scheduling tree in which distinct cell streams are scheduled recursively. Special emphasis is placed on the design and analysis of the scheduler. A delay bound is analytically derived for the scheduler and verified using simulation. The scheduler can work in either a work-conserving mode or non-work-conserving mode. It is shown that the work conserving scheduler is fair. Fairness indexes for the work conserving scheduler are analytically derived. The simple nature of this algorithm makes it possible to implement it at very high speeds, while considerably reducing the implementation cost. 相似文献

3.

Designing and implementing a fast crossbar scheduler 总被引：1，自引：0，他引：1

Gupta P. McKeown N. 《Micro, IEEE》1999,19(1):20-28

Crossbar switches frequently function as the internal switching fabric of high performance network switches and routers. However, for fairness and high utilization, a crossbar needs an intelligent, centralized scheduler. We describe the design and implementation of a scheduling algorithm for configuring crossbars in input queued switches that support virtual output queues and multiple priority levels of unicast and multicast traffic. We carried out this design for Stanford University's Tiny Tera prototype, a fast, label-swapping packet switch. Its scheduler, designed to configure a crossbar once every 51 ns, implements the ESLIP scheduling algorithm, which consists of multiple round-robin arbiters 相似文献

4.

On the Integration of Unicast and Multicast Cell Scheduling in Buffered Crossbar Switches 总被引：1，自引：0，他引：1

Mhamdi L. 《Parallel and Distributed Systems, IEEE Transactions on》2009,20(6):818-830

Internet traffic is a mixture of unicast and multicast flows. Integrated schedulers capable of dealing with both traffic types have been designed mainly for Input Queued (IQ) buffer-less crossbar switches. Combined Input and crossbar queued (CICQ) switches, on the other hand, are known to have better performance than their buffer-less predecessors due to their potential in simplifying the scheduling and improving the switching performance. The design of integrated schedulers in CICQ switches has thus far been neglected. In this paper, we propose a novel CICQ architecture that supports both unicast and multicast traffic along with its appropriate scheduling. In particular, we propose an integrated round-robin-based scheduler that efficiently services both unicast and multicast traffic simultaneously. Our scheme, named multicast and unicast round robin scheduling (MURS), has been shown to outperform all existing schemes under various traffic patterns. Simulation results suggested that we can trade the size of the internal buffers for the number of input multicast queues. We further propose a hardware implementation of our algorithm for a 16 times 16 buffered crossbar switch. The implementation results suggest that MURS can run at 20 Gbps line rate and a clock cycle time of 2.8 ns, reaching an aggregate switching bandwidth of 320 Gbps. 相似文献

5.

Scheduling multicast traffic in input-buffered ATM switch

《Computer Communications》2001,24(15-16):1607-1617

Performance of an input-buffered ATM switch is limited by the head-of-line (HOL) blocking problem. HOL blocking is even more pronounced in multicast switch where cells compete for multiple outputs simultaneously. Previous studies in input-buffered unicast switch have shown that HOL blocking can only be eliminated by using per-output queuing and sophisticated scheduling methods, such as the maximal weight matching (MWM) or the parallel iterative matching (PIM) methods. The MWM or PIM types of scheduling algorithm cannot be applied to multicast switch because of the high computation complexity. In this paper, we present a reservation based scheduling algorithm, which employs per-VC queuing for multicast connections and per-output queuing for unicast connections. Instead of the input ports sending a huge amount of state information to the output ports for processing, we circulate reservation vectors amongst the input ports. Each input port will then make reservation based on its local state and the availability of the output ports. The scheduling is done on a frame by frame basis. While the input ports are transmitting cells according to the schedule of the current frame, the next frame schedule is computed. Simulations reveal that our method substantially outperforms the methods that employ FIFO queuing discipline. 相似文献

6.

Anchored opportunity queueing: a low-latency scheduler for fair arbitration among virtual channels

Salil S. Kanhere Harish Sethu 《Journal of Parallel and Distributed Computing》2003,63(12):1288

Many wormhole interconnection networks for parallel systems, and more recently system area networks, implement virtual channels to provide a number of services including improved link utilization and lower latencies. The forwarding of flits from the virtual channels on to the physical channel is typically accomplished using flit-based round-robin (FBRR) scheduling. This paper presents a novel scheduling strategy, anchored opportunity queueing (AOQ), which preserves the throughput and fairness characteristics of FBRR while significantly reducing the average delay experienced by packets. The AOQ scheduler achieves lower average latencies by trying, as far as possible, to complete the transmission of a complete packet before beginning the transmission of flits from another packet. The AOQ scheduler achieves provable fairness in the number of opportunities it offers to each of the virtual channels for transmissions of flits over the physical channel. We prove this by showing that the relative fairness bound, a popular measure of fairness, is a small finite constant in the case of the AOQ scheduler. Finally, we present simulation results comparing the delay characteristics of AOQ with other schedulers for virtual channels. The AOQ scheduler is simple to implement in hardware, and also offers a practical solution in other contexts such as in scheduling ATM cells in Internet backbone switches. 相似文献

7.

Port partitioning and dynamic queueing for IP forwarding

《Computers & Operations Research》2002,29(9):1157-1172

With the increase of internet protocol (IP) packets the performance of routers became an important issue in internet/working. In this paper we examine the matching algorithm in gigabit router which has input queue with virtual output queueing. Dynamic queue scheduling is also proposed to reduce the packet delay and packet loss probability. Port partitioning is employed to reduce the computational burden of the scheduler in a switch which matches the input and output ports for fast packet switching. Each port is divided into two groups such that the matching algorithm is implemented within each pair of groups in parallel. The matching is performed by exchanging the pair of groups at every time slot. Two algorithms, maximal weight matching by port partitioning (MPP) and modified maximal weight matching by port partitioning (MMPP) are presented. In dynamic queue scheduling, a popup decision rule for each delay critical packet is made to reduce both the delay of the delay critical packet and the loss probability of loss critical packet. Computational results show that MMPP has the lowest delay and requires the least buffer size. The throughput is illustrated to be linear to the packet arrival rate, which can be achieved under highly efficient matching algorithm. The dynamic queue scheduling is illustrated to be highly effective when the occupancy of the input buffer is relatively high.Scope and purposeTo cope with the increasing internet traffic, it is necessary to improve the performance of routers. To accelerate the switching from input ports to output in the router partitioning of ports and dynamic queueing are proposed. Input and output ports are partitioned into two groups A/B and a/b, respectively. The matching for the packet switching is performed between group pairs (A, a) and (B, b) in parallel at one time slot and (A, b) and (B, a) at the next time slot. Dynamic queueing is proposed at each input port to reduce the packet delay and packet loss probability by employing the popup decision rule and applying it to each delay critical packet.The partitioning of ports is illustrated to be highly effective in view of delay, required buffer size and throughput. The dynamic queueing also demonstrates good performance when the traffic volume is high. 相似文献

8.

Bandwidth guaranteed multicast scheduling for virtual output queued packet switches

Deng Pan Yuanyuan Yang 《Journal of Parallel and Distributed Computing》2009

Multicast enables efficient data transmission from one source to multiple destinations, and has been playing an important role in Internet multimedia applications. Although several multicast scheduling schemes for packet switches have been proposed in the literature, they usually aim to achieve only short multicast latency and high throughput without considering bandwidth guarantees. However, fair bandwidth allocation is critical for the quality of service (QoS) of the network, and is necessary to support multicast applications requiring guaranteed performance services, such as online audio and video streaming. This paper addresses the issue of bandwidth guaranteed multicast scheduling on virtual output queued (VOQ) switches. We propose the Credit based Multicast Fair scheduling (CMF) algorithm, which aims at achieving not only short multicast latency but also fair bandwidth allocation. CMF uses a credit based strategy to guarantee the reserved bandwidth of an input port on each output port of the switch. It keeps track of the difference between the reserved bandwidth and actually received bandwidth, and minimizes the difference to ensure fairness. Moreover, in order to fully utilize the multicast capability provided by the switch, CMF lets a multicast packet simultaneously send transmission requests to multiple output ports. In this way, a multicast packet has more chances to be delivered to multiple destination output ports in the same time slot and thus to achieve short multicast latency. Extensive simulations are conducted to evaluate the performance of CMF, and the results demonstrate that CMF achieves the two design goals: fair bandwidth allocation and short multicast latency. 相似文献

9.

A novel multicast mechanism for optical local area networks

Ho-Ting Wu^{Author Vitae} Kai-Wei Ke Author Vitae Author Vitae 《Computers & Electrical Engineering》2007,33(2):94-108

In this paper, we propose a simple and efficient multicast scheduling algorithm, the Lookback Queue Access (LBQA) protocol for the employment in an asynchronous WDM optical star network. Network nodes use ALOHA based random access schemes to send their reservation packets over a common packet reservation control channel. A central traffic scheduler is allocated to coordinate message transmissions. The central scheduler carries out the LBQA scheme in real time in two phases. The first phase is to search (through input queues) for a candidate multicast message that can be sent, without partition, to all of its intended recipients. If phase 1 fails, the second phase is then activated to efficiently partition a multicast message into multiple transmissions. Performance results conducted via simulations have revealed that networks employing such a scheduling mechanism can exhibit superior network throughput levels and delay performances. To obtain sustainable high performances, we further propose an interconnected dual-star structure employed with the enhanced scheduling mechanism, the DS_LBQA algorithm. By using a simple heuristic approach, the DS_LBQA scheme is able to successfully exploit the inter data channels and properly utilize the wavelength reuse property of the intra data channels. Performance results have demonstrated the merits of deploying the DS_LBQA multicast algorithm in such a dual-star structure. 相似文献

10.

An efficient single-iteration single-bit request scheduling algorithm for input-queued switches

Bing Hu Kwan L. Yeung Zhaoyang Zhang 《Journal of Network and Computer Applications》2013,36(1):187-194

Aiming at minimizing communication overhead of iterative scheduling algorithms for input-queued packet switches, an efficient single-iteration single-bit request scheduling algorithm called Highest Rank First with Request Compression 1 (HRF/RC1) is proposed. In HRF/RC1, scheduling priority is given to the preferred input–output pair first, where each input has a distinct preferred output in each time slot. If an input does not have backlogged packets for its preferred output, each of its non-empty VOQs sends a single-bit request to the corresponding output. This single bit distinguishes one longest VOQ from other non-empty VOQs among an input port. If an output receives a request from its preferred input, it grants this input. Otherwise, it gives the higher priority to the longest VOQ than other non-empty VOQs. Similarly, an input accepts the grant following the same propriety sequence. In case of a tie, the winner is selected randomly. Compared with other single-iteration algorithms with comparable communication overhead, we show by simulations that HRF/RC1 always gives the best delay-throughput performance. 相似文献

11.

Max-Min Fair Scheduling in Input-Queued Switches 总被引：1，自引：0，他引：1

Hosaagrahara M. Sethu H. 《Parallel and Distributed Systems, IEEE Transactions on》2008,19(4):462-475

Fairness in traffic management can improve the isolation between traffic streams, offer a more predictable performance, eliminate transient bottlenecks, mitigate the effect of certain kinds of denial-of-service attacks, and serve as a critical component of a quality-of-service strategy to achieve certain guaranteed services such as delay bounds and minimum bandwidths. In this paper, we choose a popular notion of fairness called max-min fairness and provide a rigorous definition in the context of input-queued switches. We show that being fair at the output ports alone or at the input ports alone or even at both input and output ports does not actually achieve an overall max-min fair allocation of bandwidth in a switch. Instead, we propose a new algorithm that can be readily implemented in a distributed fashion at the input and output ports to determine the exact max-min fair rate allocations for the flows through the switch. In addition to proving the correctness of the algorithm, we propose a practical scheduling strategy based on our algorithm. We present simulation results, using both real traffic traces and synthetic traffic, to evaluate the fairness of a variety of popular scheduling algorithms for input-queued switches. The results show that our scheduling strategy achieves better fairness than other known algorithms for input-queued switches and, in addition, achieves throughput performance very close to that of the best schedulers. 相似文献

12.

Packet synchronization for synchronous optical deflection-routedinterconnection networks

Feehrer J.R. Ramfelt L.H. 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(6):605-611

Deflection routing resolves output port contention in packet switched multiprocessor interconnection networks by granting the preferred port to the highest priority packet and directing contending packets out other ports. When combined with optical links and switches, deflection routing yields simple bufferless nodes, high bit rates, scalable throughput, and low latency. We discuss the problem of packet synchronization in synchronous optical deflection networks with nodes distributed across boards, racks, and cabinets. Synchronous operation is feasible due to very predictable optical propagation delays. A routing control processor at each node examines arriving packets and assigns them to output ports. Packets arriving on different input ports must be bit wise aligned; there are no elastic buffers to correct for mismatched arrivals. “Time of flight” packet synchronization is done by balancing link delays during network design. Using a directed graph network model, we formulate a constrained minimization problem for minimizing link delays subject to synchronization and packaging constraints. We demonstrate our method on a ShuffleNet graph, and show modifications to handle multiple packet sizes and latency critical paths 相似文献

13.

Achieving fair service with a hybrid scheduling scheme for CICQ switches

HU HongChao GUO YunFei YI Peng & LAN JuLong 《中国科学:信息科学(英文版)》2012,(3):689-700

Providing performance guarantees for arriving traffic flows has become an important measure for today’s routing and switching systems. However, none of current scheduling algorithms built on CICQ (combined input and cross-point buffered) switches can provide flow level performance guarantees. Aiming at meeting this requirement, the feasibility of implementing flow level scheduling is discussed thoroughly. Then, based on the discussion, it comes up with a hybrid and stratified fair scheduling (HSFS) scheme, which is hierarchical and hybrid, for CICQ switches. With HSFS, each input port and output port can schedule variable length packets independently with a complexity of O(1). Theoretical analysis show that HSFS can provide delay bound, service rate and fair performance guarantees without speedup. Finally, we implement HSFS in SPES (switch performance evaluation system) to verify the analytical results. 相似文献

14.

支持公平服务的CICQ分层混合调度策略

扈红超郭云飞伊鹏兰巨龙《中国科学:信息科学》2012,(4):410-422

为到达业务提供性能保障是衡量一个交换系统性能的重要参考.针对现有联合输入交叉点排队交换结构(CICQ)调度策略缺乏基于流的服务质量保障,探讨了在CICQ交换结构实施基于"流"调度的可能性,提出了一种能够为到达业务流的提供公平服务的分层混合调度策略(HSFS).HSFS采用分层的混合调度机制,每个输入、输出端口可独立地进行变长分组交换,其复杂度为O(1),具有良好可扩展特性.理论分析结果表明,HSFS无需加速便能为到达业务提供时延上限、速率和公平性保障.最后,基于SPES对HSFS的性能进行了评估. 相似文献

15.

并行交换中支持包保序的新技术

余美荣赵宝宽董雨果郭云飞《计算机工程与应用》2004,40(33):155-157,175

由于并行交换结构的负载平衡特性和并行原理,到达同一目的输出端口的分组包被分散到了各个交换模块,当它们抵达输出端口时,其先后顺序无法得到保障。文章为解决该难题提出一种新技术,包括虚拟输入排队(VIQ)结构和包保序轮询(SKRR)算法,并且从理论上分析了这种新技术的吞吐率和时延性能。相似文献

16.

Fast and noniterative scheduling in input-queued switches: Supporting QoS

Kevin F. Chen Edwin H.-M. Sha S.Q. Zheng 《Computer Communications》2009,32(5):834-846

We report three fast and scalable scheduling algorithms that provide exact bandwidth guarantee, low delay bound, and reasonable jitter in input-queued switches. The three schedulers find a maximum input/output matching in a single iteration. They sustain 100% throughput under both uniform and bursty traffic. They work many times faster than existing scheduling schemes and their speed does not degrade with increased switch size. SRA and SRA+ algorithms are of O(1) time complexity and can be implemented in simple hardware. SRA tends to incur different delays to flows of different classes of service due to their different subscribed portions of the total bandwidth. SRA+, a weighted version of SRA, operates on cells that arrive uniformly and on cells of packets such that the cells of whole packets are switched contiguously. SRA+ improves over SRA in that all flows undergo the same delays regardless of their bandwidth shares. The schedulers operate on queue groups at the crossbar arbiters in a distributed manner. 相似文献

17.

PIPORS: a parallel input parallel output register switching system

Der-Fu Tao Author Vitae Liang-Teh Lee Author Vitae 《Computers & Electrical Engineering》2004,30(6):427-440

In order to make data exchange speed fast enough for supporting the current communication systems or networks, a high speed switching system with low transmission delay and low data loss is required. Many researchers used statistical time division multiplexing techniques to design the switching system for achieving a higher throughput. In such switching systems with n input/output ports, the internal execution speed must be n times faster than the speed of the system with single input/output port. This designing philosophy is really not an appropriate way as the demand trend for higher speed system in the future.For improving the drawbacks of the switching system mentioned above, a novel, revolutionary architecture of a Parallel Input Parallel Output Register Switching System (PIPORS) is proposed in this paper. The PIPORS is based on the interconnection of the small distributed Shared Memory Modules (SMM) and the Shift Register Switch Array (SRSA). This construction will accelerate the switching speed. In addition, the number of input/output ports of the system can easily be extended for providing a higher capacity to respond to the trend of fast increasing amount of data transferred in the system. Three simple methods to extend the input/output ports and the capacity of the internal memory are presented.For evaluating the performance of the proposed system, we made some performance comparisons among our PIPORS and Central Shared Memory Switching System (CSMS) with respect to the amount of total memory required, data loss probability, transmission delay and switching performance. It shows that a better performance can be achieved in our PIPORS. 相似文献

18.

一种基于输入队列的交换机快速会聚调度算法 总被引：1，自引：0，他引：1

刘东钢侯紫峰《计算机工程与应用》2002,38(1):150-153,190

随着网络带宽需求的增加,高性能交换机的地位日趋重要。交换机包括3个部分:(1)在输入端口保存到达此端口的信元的输入缓冲。(2)在输出端口保存将要发送的信元的输出缓冲。(3)调度输入信元到所需输出端口的调度模块。当由多个输入端口要求输出到同一输出端口的时候由此调度算法来裁决一个输入输出对。一般而言,交换机的性能很大一部分取决于这一调度算法的性能,但并不希望这一调度算法成为交换机性能的瓶颈。该文讨论了许多近年来常用的算法,在此基础上同时提出一种新的的调度算法。通过计算机模拟结果可以看出这种算法具有更高的效率,更快的会聚速度。相似文献

19.

支持组播的输入队列ATM交换机设计及其调度策略研究 总被引：1，自引：0，他引：1

翟明玉顾冠群吴国新罗军舟《计算机学报》2000,23(5):509-515

目前基于输入队列技术的 ATM交换机的研究日益活跃 .输入队列单播调度算法的研究已经取得了较多研究成果 ,并已得到商业应用 .但输入队列组播调度算法的研究目标主要集中于提高吞吐量 ,而忽略了调度算法对组播流 Qo S的影响 ,如延迟等 .文中提出了一种支持组播功能的输入队列 ATM交换机的设计方案 ,并给出相应的输入队列组播调度算法 ,称为组播最长正则队列优先算法 (ML NQF) .调度算法 ML QNF具有改善吞吐量、满足Qo S需求和公平服务等特点 . 相似文献

20.

Design of a crossbar VOQ real-time switch with clock-driven scheduling for a guaranteed delay bound

Kyungtae Kang Kyung-Joon Park Lui Sha Qixin Wang 《Real-Time Systems》2013,49(1):117-135

Most commercial network switches are designed to achieve good average throughput and delay needed for Internet traffic, whereas hard real-time applications demand a bounded delay. Our real-time switch combines clearance-time-optimal switching with clock-based scheduling on a crossbar switching fabric. We use real-time virtual machine tasks to serve both periodic and aperiodic traffic, which simplifies analysis and provides isolation from other system operations. We can then show that any feasible traffic will be switched in two clock periods. This delay bound is enabled by introducing one-shot traffic, which can be constructed at the cost of a fixed delay of one clock period. We carry out simulation to compare our switch with the popular iSLIP crossbar switch scheduler. Our switch has a larger schedulability region, a bounded lower end-to-end switching delay, and a shorter clearance time which is the time required to serve every packet in the system. 相似文献