1.

扈红超 伊鹏 郭云飞 《软件学报》2008,19(4):1036-1050

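Simulation has become an important means of evaluating the performance of switching fabrics and scheduling policies, but existing simulation software for them still falls short in inheritability and extensibility. Based on the crossbar switching fabric, this paper builds a mathematical model, introduces a system-level design methodology, and uses object-oriented techniques to design and implement SPES (switching performance evaluation system), a simulation platform for studying switching fabrics and scheduling policies. The platform integrates several switching fabrics, including input queuing, output queuing, combined input-output queuing, and combined input-crosspoint queuing, together with their corresponding scheduling policies. Its design separates traffic, switching fabric, and scheduling policy from one another, which makes it easy to inherit from and extend. Through simple interaction with the platform, users can add modules and configure simulation parameters; the platform also supports variable-length traffic, differentiated quality-of-service models, and multi-plane switching simulation. With simple extensions, the platform can also perform network-level performance simulation. Finally, using the platform, the paper reports performance results under different traffic models for DS (DiffServ supporting algorithm), a proposed distributed scheduling policy that supports the DiffServ model in the CICQ (combined input and crosspoint queuing) switching fabric, and compares them with input-queued and output-queued switching fabrics. The results show that DS performs well and validate the simulation platform.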

2.

A Fair-Service Dynamic Round-Robin Scheduling Algorithm  Total citations: 6, self-citations: 0, citations by others: 6

扈红超 伊鹏 郭云飞 李玉峰 《软件学报》2008,19(7):1856-1864

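Scheduling policy is key to the performance of core routing and switching equipment. To address the complexity or performance shortcomings of existing scheduling policies for the combined input and cross-point queuing (CICQ) switching fabric, this paper examines the basic design criteria for CICQ scheduling policies and introduces the concept of a virtual channel in CICQ. Building on these criteria and the virtual channel concept, it proposes FDR (fair service and dynamic round robin), a simple, efficient, and fair dynamic round-robin scheduling policy. FDR has O(1) complexity and scales well. It allocates scheduling shares to virtual channels according to their state, which gives it good dynamic real-time behavior and lets it adapt to non-uniform traffic loads. Simulation results from SPES (switching performance evaluation system) show that the algorithm achieves good delay, throughput, and burst-tolerance performance.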

3.

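Analysis and Design of Priority-Supporting Buffer Queues for High-Speed Switches  Total citations: 3, self-citations: 1, citations by others: 3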

杨玉海 宾雪莲 郑玉墙 《计算机工程与应用》2003,39(1):128-131

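Buffer queues play an important role in switches. To make switch scheduling more flexible and effective, buffer queuing can be implemented in hardware. To support QoS, a queuing mechanism that supports priorities is also needed. Building on a study of existing buffer queuing schemes, this paper proposes PFQ (Priority FIFO Queue), a hardware architecture for priority-aware FIFO queuing. PFQ borrows the basic idea of the shift register, organizes each FIFO queue as a linked list, and uses a high-speed local bus to effectively resolve head-of-line cell blocking, so that a switch using PFQ can implement more flexible scheduling algorithms. Simulation results show that PFQ is flexible, efficient, relatively cheap in hardware, and simple to implement.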

4.

Research on Packet Scheduling Algorithms for High-Performance Routers

江勇 吴建平 徐明伟 《软件学报》2002,13(4):621-628

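The Internet faces two problems at once: faster switching and routing fabrics, and the introduction of quality-of-service (QoS) guarantees. Each can be solved on its own. A high-performance router can use an input-buffered crossbar instead of shared memory to gain speed, and QoS can be obtained with packet fair queuing (PFQ) algorithms. So far, however, the two solutions have been mutually exclusive: all work on packet fair queuing algorithms requires the router to use output queuing or centralized shared memory. Based on the combined input output queuing (CIOQ) architecture, this paper designs and implements a …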

5.

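iRSDRR: A Fully Asynchronous Scheduling Algorithm for Input-Queued Crossbar Switching Fabrics  Total citations: 1, self-citations: 0, citations by others: 1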

严敬 邱智亮 杨君刚 《计算机工程与应用》2005,41(11):135-138

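DRR (Dual Round-Robin) [1] is a fair, efficient cell scheduling algorithm for input-queued crossbar switching fabrics that is simple to implement in hardware. To further improve its performance, this paper proposes a fully asynchronous, multi-iteration DRR algorithm, iRSDRR (iterative Rotating Static Dual Round-Robin). At start-up the algorithm sets the pointers of all input and output arbiters to be mutually asynchronous, and in every subsequent time slot it updates all arbiter pointers statically. Simulation results show that the algorithm outperforms DRR under different traffic conditions.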
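
The request/grant idea behind dual round-robin arbitration, with pointers that start out distinct and advance by one every slot, can be sketched roughly as follows. This is a minimal single-iteration illustration, not the paper's iRSDRR: the function names, the initial pointer values, and the "advance by one" update rule are all assumptions made for the example.

```python
from typing import Dict, List


def round_robin_pick(candidates: List[int], pointer: int, n: int) -> int:
    """Return the first candidate at or after `pointer` in circular order, or -1."""
    wanted = set(candidates)
    for offset in range(n):
        idx = (pointer + offset) % n
        if idx in wanted:
            return idx
    return -1


def drr_slot(voq: List[List[int]], in_ptr: List[int], out_ptr: List[int]) -> Dict[int, int]:
    """One request/grant pass; returns {input: output} matches.

    voq[i][j] is the number of cells queued at input i for output j.
    """
    n = len(voq)
    # Request phase: each input arbiter picks one non-empty VOQ.
    requests: Dict[int, List[int]] = {}
    for i in range(n):
        nonempty = [j for j in range(n) if voq[i][j] > 0]
        j = round_robin_pick(nonempty, in_ptr[i], n)
        if j >= 0:
            requests.setdefault(j, []).append(i)
    # Grant phase: each output arbiter grants one of the requesting inputs.
    match: Dict[int, int] = {}
    for j, inputs in requests.items():
        match[round_robin_pick(inputs, out_ptr[j], n)] = j
    return match


def rotate_static(pointers: List[int], n: int) -> List[int]:
    """Assumed static update: advance every pointer by one, regardless of grants."""
    return [(p + 1) % n for p in pointers]


n = 3
voq = [[1, 1, 1] for _ in range(n)]   # every VOQ is backlogged
in_ptr = list(range(n))               # hypothetical start: all input pointers distinct
out_ptr = list(range(n))              # hypothetical start: all output pointers distinct
match = drr_slot(voq, in_ptr, out_ptr)
print(match)                          # -> {0: 0, 1: 1, 2: 2}
```

With distinct input pointers no two inputs request the same output under full load, so all three ports are matched in one pass. If every input pointer started at 0 instead, all inputs would request output 0 and only one match would result, which is the pointer-synchronization effect that desynchronized pointers are meant to avoid.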

6.

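Research on High-Speed Switching Scheduling Algorithms Based on Input Queuing  Total citations: 2, self-citations: 0, citations by others: 2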

张重洋 申金媛 刘润杰 张文英 穆维新 《智能系统学报》2008,3(3)

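High-speed switching networks generally use switching fabrics based on fixed-length cells, and their performance is determined by the queuing strategy and the cell scheduling algorithm. An input queuing strategy yields good throughput and delay only when combined with an effective scheduling algorithm. This paper surveys VOQ-based input-queued scheduling algorithms, including maximum size matching, maximum weight matching, stable matching, and neural network algorithms, and compares and analyzes them in terms of technical characteristics, performance metrics, and implementation complexity. It also analyzes how the two broad classes of scheduling algorithms, distributed and centralized, operate, and argues from the characteristics of each class that a neural network algorithm can realize all the other classes through the definition of its priority function.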

7.

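A Distributed Scheduling Algorithm for Clos Networks with Cell Order Preservation  Total citations: 1, self-citations: 0, citations by others: 1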

杨君刚 鲍民权 刘增基 邱智亮 赵瑞琴 石增增 《计算机学报》2008,31(3):467-475

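Cell scheduling algorithms for packet-switched three-stage Clos networks can be implemented in either a centralized or a distributed manner. Distributed scheduling scales well and suits high-speed, high-capacity environments, but it can reorder the cells of a single packet, which makes it hard to implement. This paper proposes a distributed scheduling algorithm for three-stage Clos networks that preserves cell order. The algorithm has three parts: uniform load distribution at the first stage, parallel scheduling at the middle stage, and in-order output scheduling at the third stage. Rigorous theoretical proofs and simulation analysis of the algorithm's performance show that it solves the cell reordering problem of conventional distributed scheduling well and offers a good cost-performance ratio.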

8.

A Fully Distributed Scheduling Algorithm Supporting the DiffServ Model  Total citations: 1, self-citations: 0, citations by others: 1

伊鹏 扈红超 于婧 汪斌强 《软件学报》2008,19(7):1847-1855

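Scheduling algorithm design is crucial for network routing equipment to implement the per hop behavior (PHB) of the differentiated services (DiffServ) model. Existing scheduling algorithms that support DiffServ are generally designed for output queued (OQ) or input queued (IQ) switching fabrics, and neither can provide high-performance scheduling at high speed. Based on the combined input-crosspoint-queued (CICQ) switching fabric, this paper proposes DDSS (distributed DiffServ supporting scheduling), a fully distributed scheduling algorithm that supports the DiffServ model, and verifies its fairness through theoretical analysis. DDSS uses a stage-by-stage flow control mechanism based on reserved bandwidth to allocate all reserved bandwidth between expedited forwarding (EF) and assured forwarding (AF) traffic, and uses priority scheduling to give EF traffic low-delay service. Its complexity is O(log N). Simulation results show that DDSS has good delay performance and fairness and supports the DiffServ model better than existing algorithms.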

9.

A Survey of Input-Queued Scheduling Algorithms in High-Speed IP Routers  Total citations: 8, self-citations: 1, citations by others: 8

庞斌 贺思敏 高文 《软件学报》2003,14(5):1011-1022

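High-speed IP routers generally use switching fabrics based on fixed-length cells; their scalability and performance depend on the queuing strategy and the scheduling algorithm, respectively. Routers based on input queuing scale well but need an effective scheduling algorithm to guarantee throughput and delay. This paper focuses on input-queued scheduling algorithms and divides existing ones into four classes: maximum (unweighted) matching, maximum weight matching, stable marriage matching, and deterministic scheduling. For each class it compares and analyzes the algorithms in terms of technical characteristics and performance metrics. It closes with the development trends of input-queued scheduling algorithms.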
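
To make the first class concrete, here is a minimal sketch of a maximum size (unweighted) matching between inputs and outputs, treating each non-empty virtual output queue as an edge of a bipartite graph. It uses the textbook augmenting-path method and is not taken from the survey; the function name and the `voq` occupancy matrix are assumptions for illustration.

```python
from typing import List


def maximum_size_matching(voq: List[List[int]]) -> List[int]:
    """Return match_of_output, where match_of_output[j] is the input matched
    to output j, or -1. An edge (i, j) exists when voq[i][j] > 0."""
    n_in = len(voq)
    n_out = len(voq[0]) if voq else 0
    match_of_output = [-1] * n_out

    def try_augment(i: int, visited: List[bool]) -> bool:
        # Search for an augmenting path that starts at input i.
        for j in range(n_out):
            if voq[i][j] > 0 and not visited[j]:
                visited[j] = True
                # Output j is free, or its current input can be moved elsewhere.
                if match_of_output[j] == -1 or try_augment(match_of_output[j], visited):
                    match_of_output[j] = i
                    return True
        return False

    for i in range(n_in):
        try_augment(i, [False] * n_out)
    return match_of_output


# Hypothetical 3x3 occupancy: input 1 can only reach output 0.
voq = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
]
matching = maximum_size_matching(voq)
print(matching)  # -> [1, 0, 2]
```

Input 0 first takes output 0 but is moved to output 1 so that input 1 can be served, which is what separates a maximum matching from a greedy one. A maximum weight matching would instead use the queue lengths (or cell ages) in `voq` as edge weights rather than only testing them for non-emptiness.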

10.

Distributed Input-Queued DRR Packet Fair Scheduling in the CICQ Switching Fabric

王荣 陈越 《计算机应用》2005,25(7):1488-1490,1493

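Conventional crossbar-based input-queued switching fabrics fall well short in providing good QoS. Compared with them, the CICQ (combined input and crosspoint buffered queuing) switching fabric not only delivers throughput close to output queuing under a variety of input traffic but also provides good QoS support. Based on the CICQ architecture, this paper proposes a scheme for implementing a flow-based distributed DRR packet fair scheduling algorithm under input queuing and validates it through simulation.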

11.

Online Scheduling of Parallel Programs on Heterogeneous Systems with Applications to Cilk

Michael A. Bender Michael O. Rabin 《Theory of Computing Systems》2002,35(3):289-304

We study the problem of executing parallel programs, in particular Cilk programs, on a collection of processors of different speeds. We consider a model in which each processor maintains an estimate of its own speed, where communication between processors has a cost, and where all scheduling must be online. This problem has been considered previously in the fields of asynchronous parallel computing and scheduling theory. Our model is a bridge between the assumptions in these fields. We provide a new, more accurate analysis of an old scheduling algorithm called the maximum utilization scheduler. Based on this analysis, we generalize this scheduling policy and define the high utilization scheduler. We next focus on the Cilk platform and introduce a new algorithm for scheduling Cilk multithreaded parallel programs on heterogeneous processors. This scheduler is inspired by the high utilization scheduler and is modified to fit in a Cilk context. A crucial aspect of our algorithm is that it keeps the original spirit of the Cilk scheduler. In fact, when our new algorithm runs on homogeneous processors, it exactly mimics the dynamics of the original Cilk scheduler.

12.

The Inherent Queuing Delay of Parallel Packet Switches

《Parallel and Distributed Systems, IEEE Transactions on》2006,17(9):1048-1056

The parallel packet switch (PPS) extends the inverse multiplexing architecture and is widely used as the core of contemporary commercial switches. This paper investigates the inherent queuing delay introduced by the PPS's demultiplexing algorithm, responsible for dispatching cells to the middle-stage switches, relative to an optimal work-conserving switch. We first consider an N × N PPS without buffers in its input ports, operating at external rate R, internal rate r …

13.

Parallel Switch System with QoS Guarantee for Real-Time Traffic  Total citations: 1, self-citations: 0, citations by others: 1

Wen-Jie Li Bin Liu Yang Xu and Heng Liao 《计算机科学技术学报》2006,21(6):1012-1021

This paper studies the load-balancing algorithm and quality of service (QoS) control mechanism in a 320Gb/s switch system, which incorporates four packet-level parallel switch planes. Eight priorities for both unicast and multicast traffic are implemented, and the highest priority with strict QoS guarantee is designed for real-time traffic. Through performance analysis under multi-priority burst traffic, we demonstrate that the load-balancing algorithm is efficient, and the switch system not only provides excellent performance to real-time traffic, but also efficiently allocates bandwidth among other traffic of lower priorities. As a result, this parallel switch system scales better towards next-generation core routers with QoS guarantees, and it ensures in-order delivery of IP packets.

14.

Scheduling multicast traffic in input-buffered ATM switch

《Computer Communications》2001,24(15-16):1607-1617

Performance of an input-buffered ATM switch is limited by the head-of-line (HOL) blocking problem. HOL blocking is even more pronounced in a multicast switch, where cells compete for multiple outputs simultaneously. Previous studies of input-buffered unicast switches have shown that HOL blocking can only be eliminated by using per-output queuing and sophisticated scheduling methods, such as the maximal weight matching (MWM) or the parallel iterative matching (PIM) methods. The MWM or PIM types of scheduling algorithm cannot be applied to a multicast switch because of their high computational complexity. In this paper, we present a reservation-based scheduling algorithm, which employs per-VC queuing for multicast connections and per-output queuing for unicast connections. Instead of the input ports sending a huge amount of state information to the output ports for processing, we circulate reservation vectors amongst the input ports. Each input port then makes a reservation based on its local state and the availability of the output ports. The scheduling is done on a frame-by-frame basis. While the input ports are transmitting cells according to the schedule of the current frame, the next frame's schedule is computed. Simulations reveal that our method substantially outperforms methods that employ the FIFO queuing discipline.
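
The circulating reservation vector can be pictured with a small sketch: one availability vector visits the input ports in turn, and each port reserves a still-free output using only its own queue state. This covers a single slot of unicast traffic only. The longest-queue choice, the `start` offset, and all names are assumptions made for illustration, and the paper's frame structure and per-VC multicast queues are left out.

```python
from typing import Dict, List, Optional


def circulate_reservations(demand: List[List[int]], start: int = 0) -> Dict[int, int]:
    """Pass one availability vector around the input ports for a single slot.

    demand[i][j] is the number of unicast cells input i holds for output j.
    Returns {input: output} reservations for that slot.
    """
    n_in = len(demand)
    n_out = len(demand[0]) if demand else 0
    available = [True] * n_out          # the circulating reservation vector
    reserved: Dict[int, int] = {}
    for step in range(n_in):
        i = (start + step) % n_in       # port currently holding the vector
        # Local decision (assumed rule): among free outputs, take the longest queue.
        best: Optional[int] = None
        for j in range(n_out):
            if available[j] and demand[i][j] > 0:
                if best is None or demand[i][j] > demand[i][best]:
                    best = j
        if best is not None:
            available[best] = False     # mark the output as taken for later ports
            reserved[i] = best
    return reserved


# Hypothetical per-output queue lengths for three input ports.
demand = [
    [2, 0, 1],
    [3, 1, 0],
    [0, 0, 4],
]
slot0 = circulate_reservations(demand, start=0)
slot1 = circulate_reservations(demand, start=1)
print(slot0)  # -> {0: 0, 1: 1, 2: 2}
print(slot1)  # -> {1: 0, 2: 2}
```

Whichever port sees the vector first gets first choice, so in `slot1` input 0 finds both of its outputs already taken. Rotating the starting port between slots is one simple way to spread that advantage; it is an assumption of this sketch, not a claim about the paper.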

15.

Distributed fair DRAM scheduling in network-on-chips architecture

《Journal of Systems Architecture》2013,59(7):543-550

Memory access scheduling is an effective way to improve the performance of Chip Multi-Processors (CMPs) by taking advantage of the timing characteristics of a DRAM. A memory access scheduler can subdivide resource utilization (banks and rows) to increase throughput by accessing different DRAM banks in parallel. However, different threads running on different cores may exhibit different performance. One thread may experience starvation while the others are serviced normally. Therefore, designing a scheduler which reduces the unfairness in the DRAM system, while also improving system throughput on a variety of workloads and systems, is necessary. In this paper, a distributed fair DRAM scheduling scheme for two-dimensional mesh network-on-chips (NoCs), called DFDS, is presented. The key design points in DFDS are: (i) assessing the total waiting cycles of a memory request in the NoC and considering it as a metric in arbitration. For this purpose the waiting cycles of a memory request are put in an additional flit in a packet and are updated while traversing the NoC, and (ii) proposing a semi-dynamic virtual channel allocation to provide in-order memory requests to memory controllers (MCs). Consequently, we use a simple scheduling algorithm in MCs, instead of complex algorithms. To validate our approach, we apply synthetic and real workloads from the Parsec benchmark suite. The results show the effectiveness of our approach, as we reduce the waiting time of memory requests by up to 15%.

16.

Dynamic scheduling of a batch of parallel task jobs on heterogeneous clusters

Jorge G. Barbosa Belmiro Moreira 《Parallel Computing》2011,37(8):428-438

This paper addresses the problem of minimizing the scheduling length (makespan) of a batch of jobs with different arrival times. A job is described by a directed acyclic graph (DAG) of parallel tasks. The paper proposes a dynamic scheduling method that adapts the schedule when new jobs are submitted and that may change the processors assigned to a job during its execution. The scheduling method is divided into a scheduling strategy and a scheduling algorithm. We also propose an adaptation of the Heterogeneous Earliest-Finish-Time (HEFT) algorithm, called here P-HEFT, to handle parallel tasks in heterogeneous clusters with good efficiency without compromising the makespan. The results of a comparison of this algorithm with another DAG scheduler using a simulation of several machine configurations and job types show that P-HEFT gives a shorter makespan for a single DAG but scores worse for multiple DAGs. Finally, the results of the dynamic scheduling of a batch of jobs using the proposed scheduler method showed significant improvements for more heavily loaded machines when compared to the alternative resource reservation approach.

17.

Real-Time Scheduling of Messages Transmitted across Multiple Switches in Master-Slave Switched Ethernet

檀明 《计算机工程与科学》2015,37(10):1862-1868

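To address the scheduling of messages that traverse multiple switches in the single-master, multi-switch network extension of the FTT-SE protocol, this paper gives a method for computing the time at which a message arrives at each switch output port within each elementary cycle (EC), proposes an algorithm for deciding message schedulability within a single EC, and proves the algorithm's feasibility. On this basis it designs an EDF-based real-time message scheduling algorithm and an admission control algorithm. By determining the arrival time of each message at each switch output port within every elementary cycle, the proposed scheduling algorithm can account for the FCFS message transmission mechanism of COTS switch output ports and precisely control and schedule message transmission within a single EC. Simulation experiments show that, compared with existing scheduling algorithms, the proposed algorithm uses network bandwidth more effectively and improves the real-time performance of master-slave switched Ethernet communication.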

18.

Analysis of Centralized and Distributed Parallel Packet Switching Algorithms for PPS

李玉峰 兰巨龙 《计算机工程》2004,30(24):37-39

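The Centralized Parallel Packet Switch Algorithm (CPA) and the Distributed Parallel Packet Switch Algorithm (DPA) are the representative algorithms in current Parallel Packet Switch (PPS) research. This paper describes the two algorithms, analyzes them theoretically, compares their performance, assesses their applicability, and discusses several key problems that remain to be studied and solved in implementing DPA.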

19.

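A Soft Real-Time Virtual Machine Scheduling Method Based on Resource Pre-Allocation  Total citations: 1, self-citations: 0, citations by others: 1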

丁晓波 马中 戴新发 黄伟华 《计算机工程与科学》2015,37(5):865-872

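Virtual machine technology, one of the key technologies of cloud computing, has attracted wide attention in recent years. However, the virtual machine management layer introduces a semantic gap that degrades the performance of real-time and concurrent applications running on virtual machines. This paper analyzes the Credit scheduling algorithm of the Xen hypervisor and, to address its weaknesses in concurrent and soft real-time scheduling, proposes an improved scheduling algorithm and implements a prototype scheduler. The new algorithm pre-allocates Credit shares to soft real-time virtual machines, uses a dynamic scheduling time-slice mechanism, and schedules periodic soft real-time tasks in a non-work-conserving manner so that the scheduling period meets the tasks' period requirements. It distinguishes concurrent from non-concurrent soft real-time virtual machines and applies a different scheduling policy to each, ensuring that real-time tasks run smoothly while resource utilization requirements are met. Test results show that the algorithm performs well for both concurrent and non-concurrent soft real-time tasks and meets the scheduling needs of soft real-time applications well.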

20.

Design and Implementation of a Crossbar Scheduler  Total citations: 2, self-citations: 0, citations by others: 2

孙志刚 赵国鸿 卢锡城 《计算机工程与科学》2001,23(2):14-16

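Broadband network switching equipment often uses a crossbar switch as its internal switching array, and a centralized crossbar scheduler is key to making the crossbar work efficiently. ISP (Input Serial Polling) is a simple and efficient crossbar scheduling algorithm. This paper describes the design and implementation of an ISP scheduler in detail.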