期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

伊鹏汪斌强陈庶樵李挥《软件学报》2009,20(8):2289-2297

提出一种交错编码的多重门限调度算法(interleaving coded multi-threshold scheduling,简称ICMTS).该算法将前、后级队列门限标记交错编码作为权值表征输入调度过程前、后两级队列的整体调度需求,根据交错编码的权值对前级虚拟输出队列进行优化调度判决,并通过多重门限机制降低算法的硬件资源开销.采用流模型证明当加速因子为2时,ICMTS算法可获得100%的吞吐量,并给出ICMTS算法的工程简化设计方案,复杂度为O(logN).仿真仿真结果表明,采用ICMTS算法的工程简化方案即可获得比现有算法更优的调度性能. 相似文献

2.

一种交叉点小缓存CICQ交换机高性能调度算法 总被引：6，自引：1，他引：5

李勇罗军舟吴俊《计算机研究与发展》2006,43(12):2033-2040

CICQ（combined input crosspoint queued/queuing）结构具有内部无需提速及输入和输出的分组调度可以分布并行执行的优点,使用RR（round robin）算法在高性能交换机设计中具有独特优势.然而,CICQ交换机使用RR算法在非均匀流量下不能达到100%的吞吐率. RR-RR算法在非均匀流量下性能有两个关键因素组成：中央缓存容量大小和输入端长队列未能及时服务导致的服务损失.基于理论分析,提出了一种小缓存高性能调度算法,仿真结果表明,即使在1个信元缓存的情况下新算法在均匀与非均匀流量下均能达到100%吞吐率.新算法仅具有O（1）的复杂度,保持了RR-RR算法简单有效特性,同时克服了RR-RR算法在非均匀流量下的不稳定性. 相似文献

3.

基于双轮转指针的输入与交叉点联合排队型(CICQ)交换结构调度算法 总被引：2，自引：1，他引：1

郑燕峰孙书韬贺思敏高文《计算机研究与发展》2006,43(7):1225-1232

CICQ交换结构因具有良好的分布式调度特性而成为构建太比特（Tb／s）级以上交换机的一种理想选择．轮转型调度算法因硬件实现的简单性而得到广泛的研究,尽管此类型的调度算法在均匀流量下具有较高的吞吐率,然而在非均匀的流量下其性能则明显下降．指出了已有轮转型算法在非均匀流量下性能下降的原因,提出了一类基于双指针的轮转型调度算法,即每个输入调度器均有两个轮转指针（主指针和辅助指针）．主指针对应的队列具有最高的调度优先级,算法可以根据各个队列的状态动态决定何时更新主指针。当主指针对应的队列被流控机制阻塞时,将根据辅助指针依次公平服务其他队列．实验结果表明,基于双指针的调度算法可以显著提高CICQ交换机在非均匀流量下的性能．相似文献

4.

Designing and implementing a fast crossbar scheduler 总被引：1，自引：0，他引：1

Gupta P. McKeown N. 《Micro, IEEE》1999,19(1):20-28

Crossbar switches frequently function as the internal switching fabric of high performance network switches and routers. However, for fairness and high utilization, a crossbar needs an intelligent, centralized scheduler. We describe the design and implementation of a scheduling algorithm for configuring crossbars in input queued switches that support virtual output queues and multiple priority levels of unicast and multicast traffic. We carried out this design for Stanford University's Tiny Tera prototype, a fast, label-swapping packet switch. Its scheduler, designed to configure a crossbar once every 51 ns, implements the ESLIP scheduling algorithm, which consists of multiple round-robin arbiters 相似文献

5.

CICQ交换的流控实现机制和交叉点缓存容量分析

王荣林予松《计算机工程》2006,32(7):240-242

传统的基于crossbar的输入排队交换结构在提供良好的QOS方面存在很大的不足，而CICQ（combined input and crosspoint buffered queuing）交换结构与传统的交换结构相比，不但能在各种输入流下提供接近输出排队的吞吐率，而且能提供良好的QoS支持。文章分析了CICQ结构的流控实现机制，讨论了基于信用的流控机制的开销和实现方案，对crosspoint缓存容鼍作了分析，给出了在各种存储器写入条件下，保持交换结构100％吞吐率所需的最小缓存容量。相似文献

6.

On the Integration of Unicast and Multicast Cell Scheduling in Buffered Crossbar Switches 总被引：1，自引：0，他引：1

Mhamdi L. 《Parallel and Distributed Systems, IEEE Transactions on》2009,20(6):818-830

Internet traffic is a mixture of unicast and multicast flows. Integrated schedulers capable of dealing with both traffic types have been designed mainly for Input Queued (IQ) buffer-less crossbar switches. Combined Input and crossbar queued (CICQ) switches, on the other hand, are known to have better performance than their buffer-less predecessors due to their potential in simplifying the scheduling and improving the switching performance. The design of integrated schedulers in CICQ switches has thus far been neglected. In this paper, we propose a novel CICQ architecture that supports both unicast and multicast traffic along with its appropriate scheduling. In particular, we propose an integrated round-robin-based scheduler that efficiently services both unicast and multicast traffic simultaneously. Our scheme, named multicast and unicast round robin scheduling (MURS), has been shown to outperform all existing schemes under various traffic patterns. Simulation results suggested that we can trade the size of the internal buffers for the number of input multicast queues. We further propose a hardware implementation of our algorithm for a 16 times 16 buffered crossbar switch. The implementation results suggest that MURS can run at 20 Gbps line rate and a clock cycle time of 2.8 ns, reaching an aggregate switching bandwidth of 320 Gbps. 相似文献

7.

支持多优先级分组交换调度算法研究及其调度器设计 总被引：2，自引：0，他引：2

刘化君刘斌《计算机工程与应用》2002,38(14):92-94,97

输入缓存交换结构的特点是缓存器和交换结构的运行速率与端口速率相等、实现容易,但存在队头阻塞。如果采用虚拟输出排队方法和适当的分组调度算法可予以消除,使吞吐率达到100%。文章首先研究讨论了并行迭代匹配算法,滑动迭代匹配调度算法的基本原理、迭代仲裁步骤及其硬件实现;对高速分组交换调度算法的性能进行了分析比较。然后给出了在高速输入队列交换机中实现多优先级调度算法的调度器设计与实现方案。经设计实现证明高速分组交换调度算法不仅硬件实现简单,而且具有良好的特性。相似文献

8.

Tiny Tera: a packet switch core

McKeown N. Izzard M. Mekkittikul A. Ellersick W. Horowitz M. 《Micro, IEEE》1997,17(1):26-33

Describes Tiny Tera: a small, high-bandwidth, single-stage switch. Tiny Tera has 32 ports switching fixed-size packets, each operating at over 10 Gbps (approximately the Sonet OC-192e rate, a telecom standard for system interconnects). The switch distinguishes four classes of traffic and includes efficient support for multicasting. We aim to demonstrate that it is possible to use currently available CMOS technology to build this compact switch with an aggregate bandwidth of approximately 1 terabit per second and a central hub no larger than a can of soda. Such a switch could serve as a core for an ATM switch or an Internet router. Tiny Tera is an input-buffered switch, which makes it the highest bandwidth switch possible given a particular CMOS and memory technology. The switch consists of three logical elements: ports, a central crossbar switch, and a central scheduler. It queues packets at a port on entry and optionally prior to exit. The scheduler, which has a map of each port's queue occupancy, determines the crossbar configuration every packet time slot. Input queueing, parallelism, and tight integration are the keys to such a high-bandwidth switch. Input queueing reduces the memory bandwidth requirements: When a switch queues packets at the input, the buffer memories need run no faster than the line rate. Thus, there is no need for the speedup required in output-queued switches 相似文献

9.

An efficient implementation of bypass queue under bursty traffic

《Parallel Computing》1997,23(6):777-781

In this paper we present a new bypass queue scheme for an input buffered nonblocking packet switch operating under bursty traffic. The proposed scheme uses first-in-first-out (FIFO) queues and is thus more efficient for implementation as compared to other schemes which use first-in-random-out (FIRO) queues. Maximum throughput comparison of the proposed scheme with the conventional scheme shows significant improvement. 相似文献

10.

卫星CICQ交换系统调度算法研究

下载免费PDF全文

姚晔江玉洁梁旭文《计算机工程》2012,38(21):22-25,29

联合输入交叉点排队(CICQ)交换结构由于在交叉点引入少量缓存,可以将输入端口和输出端口进行有效隔离,降低调度算法的复杂度,并适用于大容量交换。为此,研究基于交叉点缓存的各种调度算法和基于CICQ的交换结构,提出一种基于流量控制的FCSA算法,通过OPNET仿真分析表明该算法在均匀分布和突发业务源的情况下具有较好的时延性能,并且复杂度低,吞吐量大。将该算法应用于星载交换机,结果表明,该算法可以满足星载交换机多业务突发传输的特点,易于硬件实施。相似文献

11.

Synchronous versus asynchronous operation of a packet switch with combined input and output queueing

Ilias Iliadis 《Performance Evaluation》1992,16(1-3):241-250

A single-stage non-blocking N × N packet switch with combined input and output queueing is considere. The limited queueing at the output ports partially resolves output port contention. Overflow at the output queues is prevented by employment of a backpressure mechanism and additional queueing at the input ports. This paper investigates the performance of the switch under two different modes of operation: asynchronous and synchronous or slotted. For the purpose of comparison a switch model is developed. Assuming Poisson packet arrivals, several performance measures are obtained analytically. These include the distribution of the delay through the switch, the input queue length distribution, packet losses at the inputs in the case of finite input queues, and the maximum switch throughput. The results obtained demonstrate a slight performance advantage of asynchronous over synchronous operation. However, the maximum switch throughput is the same for both modes of operation. 相似文献

12.

CICQ的高性能LQF_DRR调度算法

唐权高志江《计算机工程》2011,37(7):118-120

通过研究4种经典的CICQ调度算法,提出一种高性能的LQF_DRR交换调度算法。该算法在输入端采用最长队列优先调度策略,在输出端采用DRR调度机制,通过输入端与输出端的相互配合,优先服务异常队列,以减小交换结构输入端长队列对算法性能的影响。仿真结果证明该算法在各种流量下都有良好的时延性能和稳定性。相似文献

13.

在CICQ交换结构下实现分布式的输入排队DRR分组公平调度

王荣陈越《计算机应用》2005,25(7):1488-1490,1493

传统的基于crossbar的输入排队交换结构在提供良好的QoS方面存在很大的不足,而CICQ(combined input and crosspoint buffered queuing)交换结构与传统的交换结构比,不但能在各种输入流下提供接近输出排队的吞吐率,而且能提供良好的QoS支持。基于CICQ结构,提出了在输入排队条件下实现基于流的分布式DRR分组公平调度算法的方案,并通过仿真验证了这一方案的有效性。相似文献

14.

iLQF调度算法及其参数的仿真 总被引：1，自引：0，他引：1

李立峰郭东辉《计算机与网络》2000,(6):26-26,28

文章介绍一种无内部阻塞ATM交换结构的输入缓存模型及其缓存信元的iLQF调度算法(即迭代的最长队列优先调度算法),并具体给出该算法的实现方案。通过仿真,分析和确定了交换结构的输入缓存长度和iLQF调度算法的迭代次数对信元丢失率的影响。相似文献

15.

一种基于输入队列的交换机快速会聚调度算法 总被引：1，自引：0，他引：1

刘东钢侯紫峰《计算机工程与应用》2002,38(1):150-153,190

随着网络带宽需求的增加,高性能交换机的地位日趋重要。交换机包括3个部分:(1)在输入端口保存到达此端口的信元的输入缓冲。(2)在输出端口保存将要发送的信元的输出缓冲。(3)调度输入信元到所需输出端口的调度模块。当由多个输入端口要求输出到同一输出端口的时候由此调度算法来裁决一个输入输出对。一般而言,交换机的性能很大一部分取决于这一调度算法的性能,但并不希望这一调度算法成为交换机性能的瓶颈。该文讨论了许多近年来常用的算法,在此基础上同时提出一种新的的调度算法。通过计算机模拟结果可以看出这种算法具有更高的效率,更快的会聚速度。相似文献

16.

Priority queues and sorting methods for parallel simulation

Grammatikakis M.D. Liesche S. 《IEEE transactions on pattern analysis and machine intelligence》2000,26(5):401-422

The authors examine the design, implementation, and experimental analysis of parallel priority queues for device and network simulation. They consider: 1) distributed splay trees using MPI; 2) concurrent heaps using shared memory atomic locks; and 3) a new, more general concurrent data structure based on distributed sorted lists, designed to provide dynamically balanced work allocation and efficient use of shared memory resources. We evaluate performance for all three data structures on a Cray-TSESOO system at KFA-Julich. Our comparisons are based on simulations of single buffers and a 64×64 packet switch which supports multicasting. In all implementations, PEs monitor traffic at their preassigned input/output ports, while priority queue elements are distributed across the Cray-TBE virtual shared memory. Our experiments with up to 60000 packets and two to 64 PEs indicate that concurrent priority queues perform much better than distributed ones. Both concurrent implementations have comparable performance, while our new data structure uses less memory and has been further optimized. We also consider parallel simulation for symmetric networks by sorting integer conflict functions and implementing a packet indexing scheme. The optimized message passing network simulator can process ~500 K packet moves in one second, with an efficiency that exceeds ~50 percent for a few thousand packets on the Cray-T3E with 32 PEs. All developed data structures form a parallel library. Although our concurrent implementations use the Cray-TSE ShMem library, portability can be derived from Open-MP or MP1-2 standard libraries, which will provide support for one-way communication and shared memory lock mechanisms 相似文献

17.

(TCBF):CICQ中一种复杂度低的高效算法

周功业李成俊《小型微型计算机系统》2007,28(6):1144-1148

随着芯片集成度快速提高,带有交叉缓存(crosspoint buffer)的cicq(Combined Input and Crossbar Queued switch)交换机引起了人们的广泛关注和研究兴趣.分析了cicq模型的性质后,提出了在cicq中瞬时加速比和有效加速比的概念,指出瞬时加速比也是一种资源.分析了输入队列和输出队列对cicq系统性能的影响,在此基础上提出了基于cicq的输入仲裁算法-TCBF(Thresh-hold Crosspoint Buffer First),并进行了仿真实验,针对非均匀业务模式(nonuniform)的情况作了改进.实验证明,这种时间复杂度为O(1)的算法有100%的吞吐率,有和OQ-N近似的输出延迟.并且非常易于硬件实现.最后讨论硬件实现办法. 相似文献

18.

基于Virtual Output Queued交换结构的最大权重匹配算法

鄂大伟《计算机工程与应用》2001,37(18):66-69

信头阻塞（HOL）限制了采用FIFO输入队列交换机的吞吐率,而使用虚输出队列（VOQ）技术可以完全消除HOL阻塞。文章给出了VOQ的交换机模型,介绍了基于最大权重匹配的算法LQF、OCF、LPF及其性能,还描述了更加实用的并行迭代算法i－LQF、i－OCF和i－LPF。文章的结论对于构造高带宽的交换机具有实际意义。相似文献

19.

缓冲交叉开关交换结构多播调度算法研究 总被引：1，自引：0，他引：1

孙书韬贺思敏郑燕峰高文《计算机研究与发展》2006,43(6):1036-1043

高性能核心交换设备多播调度受到越来越多的关注·交叉开关结构下的多播调度方案或者性能较差,或者过于复杂,难于应用在高速交换场合·为此,提出一种面向多播的多输入队列缓冲交叉开关体系结构·将多播调度分解为信元分派、输入调度、输出调度3个可分布式并行执行的子问题,并设计了相应的调度算法,降低了算法复杂性·实验结果表明,交叉点缓冲区容量与输入队列数量对多播性能都具有很大的影响·在突发流量到达下,与单多播输入队列的体系结构相比,无论是采用O(1)复杂度的HA-RR-RR还是复杂度更高的调度算法,均能显著提高系统吞吐性能· 相似文献

20.

Maximizing throughput in multi-queue switches

Yossi Azar Arik Litichevskey 《Algorithmica》2006,45(1):69-90

We study a basic problem in Multi-Queue switches. A switch connectsm input ports to a single output port. Each input port is equipped with an incoming FIFO queue with bounded capacityB. A switch serves its input queues by transmitting packets arriving at these queues, one packet per time unit. Since the arrival rate can be higher than the transmission rate and each queue has limited capacity, packet loss may occur as a result of insufficient queue space. The goal is to maximize the number of transmitted packets. This general scenario models most current networks (e.g. IP networks) which only support a “best effort” service in which all packet streams are treated equally. A 2-competitive algorithm for this problem was designed in [5] for arbitraryB. Recently, a (17/9 ≈ 1.89)-competitive algorithm was presented forB>1 in [3]. Our main result in this paper shows that forB which is not too small our algorithm can do better than 1.89, and approach a competitive ratio ofe/(e − 1) ≈ 1.58. The research of Yossi Azar was supported in part by the Israeli Ministry of Industry and Trade and by the Israel Science Foundation. 相似文献