1.

Narasimha Reddy A.L. Wyllie J.C. 《Computer》1994,27(3):69-74

In future computer system design, I/O systems will have to support continuous media such as video and audio, whose system demands are different from those of data such as text. Multimedia computing requires us to focus on designing I/O systems that can handle real-time demands. Video- and audio-stream playback and teleconferencing are real-time applications with different I/O demands. We primarily consider playback applications which require guaranteed real-time I/O throughput. In a multimedia server, different service phases of a real-time request are disk, small computer systems interface (SCSI) bus, and processor scheduling. Additional service might be needed if the request must be satisfied across a local area network. We restrict ourselves to the support provided at the server, with special emphasis on two service phases: disk scheduling and SCSI bus contention. When requests have to be satisfied within deadlines, traditional real-time systems use scheduling algorithms such as earliest deadline first (EDF) and least slack time first. However, EDF makes the assumption that disks are preemptable, and the seek-time overheads of its strict real-time scheduling result in poor disk utilization. We can provide the constant data rate necessary for real-time requests in various ways that require trade-offs. We analyze how trade-offs that involve buffer space affect the performance of scheduling policies. We also show that deferred deadlines, which increase buffer requirements, improve system performance significantly.
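
The seek-overhead point in this abstract can be illustrated with a small sketch. It is not taken from the paper: the `Request` type, the track numbers, and the single-sweep `scan_order` helper are assumptions chosen only to show why serving requests strictly by deadline can move the disk head much farther than a position-aware order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    track: int       # hypothetical disk track (cylinder) the request targets
    deadline: float  # time by which the data must be delivered


def total_seek(start_track, order):
    """Sum of head movements (in tracks) when serving requests in the given order."""
    distance, head = 0, start_track
    for req in order:
        distance += abs(req.track - head)
        head = req.track
    return distance


def edf_order(requests):
    """Strict earliest-deadline-first: track positions are ignored entirely."""
    return sorted(requests, key=lambda r: r.deadline)


def scan_order(requests, start_track):
    """One upward sweep from the head position, then one downward sweep.
    Deadlines are ignored here, which is the opposite trade-off to EDF."""
    up = sorted((r for r in requests if r.track >= start_track),
                key=lambda r: r.track)
    down = sorted((r for r in requests if r.track < start_track),
                  key=lambda r: r.track, reverse=True)
    return up + down


# Made-up workload: deadlines alternate between far-apart tracks.
requests = [Request(90, 1.0), Request(10, 2.0), Request(80, 3.0), Request(20, 4.0)]
edf_seek = total_seek(50, edf_order(requests))
scan_seek = total_seek(50, scan_order(requests, 50))
print(edf_seek, scan_seek)
```

On this workload the deadline order makes the head cross the disk several times, while the sweep order does not; a scheduler that meets deadlines and also limits seeks has to sit somewhere between the two.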

2.

Real-time scheduling of divisible loads in cluster computing environments

Xuan Lin Anwar Mamat Ying Lu Jitender Deogun Steve Goddard 《Journal of Parallel and Distributed Computing》2010

Cluster computing has become an important paradigm for solving large-scale problems. To enhance the quality of service (QoS) and provide performance guarantees in a cluster computing environment, various real-time scheduling algorithms and workload models have been investigated. Computational loads that can be arbitrarily divided into independent tasks represent many real-world applications. However, the problem of providing performance guarantees to divisible load applications has only recently been studied systematically. In this work, three important and necessary design decisions, (1) workload partitioning, (2) node assignment, and (3) task execution order, are identified for real-time divisible load scheduling. A scheduling framework that can configure different policies for each of the three design decisions is proposed and used to generate various algorithms. This paper systematically studies these algorithms and identifies scenarios where the choices of design parameters have significant effects.
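
As a rough illustration of the first design decision (workload partitioning), the sketch below splits an arbitrarily divisible load across nodes in proportion to their speeds so that all nodes finish together. This is a textbook simplification, not the paper's algorithm: it assumes zero communication cost, and the function names and speed values are hypothetical.

```python
def partition_load(total_load, node_speeds):
    """Split a divisible load so every node finishes at the same time.

    Assumes zero communication cost; speeds are in load units per second.
    """
    total_speed = sum(node_speeds)
    return [total_load * speed / total_speed for speed in node_speeds]


def finish_times(chunks, node_speeds):
    """Time each node needs to process its chunk."""
    return [chunk / speed for chunk, speed in zip(chunks, node_speeds)]


speeds = [1.0, 2.0, 3.0]              # made-up node speeds
chunks = partition_load(120.0, speeds)
times = finish_times(chunks, speeds)
print(chunks, times)
```

Node assignment and task execution order, the other two decisions named in the abstract, are outside the scope of this sketch.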

3.

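Heterogeneity-aware scheduling techniques for performance-asymmetric multicore processors

赵姗 杨秋松 李明树 《软件学报》2019,30(4):1164-1190

To meet the diverse demands of applications, heterogeneous multicore processors have emerged and are gradually entering the market. Their processing cores have different microarchitectures or instruction set architectures (ISAs) and offer applications diverse features, such as instruction-level parallelism (ILP) and memory-level parallelism (MLP). The cores work together to meet the optimization goals of the whole computing system, such as high performance, low power consumption, or good energy efficiency. However, today's mainstream scheduling techniques are designed mainly for traditional homogeneous processor architectures and do not account for differences in heterogeneous hardware capabilities. How scheduling in a heterogeneous multicore environment can be made aware of hardware heterogeneity, so that different types of applications receive more suitable and better-matched hardware resources, is a question worth exploring. This paper surveys recent results in this research area. In particular, for performance-asymmetric multicore architectures, it analyzes and describes the main issues that heterogeneous scheduling faces (optimization goals, analytical models, scheduling decisions, and algorithm evaluation), systematically summarizes the related techniques in turn, and finally discusses future research from the perspective of hardware-software integration.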

4.

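A CICQ scheduling policy supporting the DiffServ model

李印海 扈红超 郭云飞 《计算机工程》2007,33(21):108-110

Motivated by the practical need for large-scale access aggregation routers to treat different aggregated traffic flows differently, this paper presents a scheduling policy (DS), based on the CICQ switch fabric, that supports the DiffServ model. The algorithm schedules traffic flows in a "node behavior" manner. Compared with previous algorithms, DS adopts a distributed control strategy, has lower time complexity, and is easier to implement in practice. Simulation results show that DS not only provides bandwidth guarantees for EF and AF traffic but also has good delay performance.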

5.

A scheduling discipline and admission control policy for Xunet 2

Huzur Saran Srinivasan Keshav Charles R. Kalmanek 《Multimedia Systems》1994,2(3):118-128

Xunet 2 is a collaborative research program with a goal of understanding the fundamental issues in the performance of ATM networks. These networks are expected to carry a mixture of constant bit-rate traffic, variable bit-rate traffic and computer traffic spanning a wide range of performance requirements. This paper describes these service requirements and matches them with performance guarantees that can be provided by the scheduling discipline supported by an experimental ATM switch. The scheduler supports per-virtual-circuit queueing and several priorities of round robin service in order to segregate real-time and non-real-time traffic and provide fair sharing for bursty computer traffic. Detailed simulations show that real-time traffic can be efficiently integrated with non-real-time traffic using appropriate call admission policies and enhancements to traditional round robin scheduling. While the present study focuses on providing quality of service guarantees in the Xunet 2 network, the design of the scheduler and the call admission policies are relevant to ATM networks in general. On leave of: Indian Institute of Technology, Delhi, India
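
The combination of per-virtual-circuit queueing with several priorities of round robin service can be sketched as follows. This is a generic illustration under stated assumptions, not the Xunet 2 switch design: the class name `PriorityRoundRobin`, the VC identifiers, and the rule that a lower level number means higher priority are all hypothetical, and the paper's admission control and round robin enhancements are not modeled.

```python
from collections import deque


class PriorityRoundRobin:
    """Per-VC queues served round robin within each priority level."""

    def __init__(self, num_priorities):
        # One ring of backlogged virtual circuits (VCs) per level; level 0 is highest.
        self.rings = [deque() for _ in range(num_priorities)]
        self.queues = {}    # vc_id -> deque of cells waiting on that VC
        self.priority = {}  # vc_id -> priority level

    def add_vc(self, vc_id, priority):
        self.queues[vc_id] = deque()
        self.priority[vc_id] = priority

    def enqueue(self, vc_id, cell):
        queue = self.queues[vc_id]
        if not queue:
            # The VC becomes backlogged: join the tail of its priority ring.
            self.rings[self.priority[vc_id]].append(vc_id)
        queue.append(cell)

    def dequeue(self):
        """Serve one cell from the highest-priority backlogged ring, or return None."""
        for ring in self.rings:
            if ring:
                vc_id = ring.popleft()
                queue = self.queues[vc_id]
                cell = queue.popleft()
                if queue:
                    ring.append(vc_id)  # still backlogged: go to the back of the ring
                return vc_id, cell
        return None


sched = PriorityRoundRobin(num_priorities=2)
sched.add_vc("rt", 0)   # real-time VC
sched.add_vc("a", 1)    # two non-real-time VCs sharing level 1
sched.add_vc("b", 1)
for vc, cell in [("a", "a1"), ("a", "a2"), ("b", "b1"), ("rt", "rt1")]:
    sched.enqueue(vc, cell)
served = [sched.dequeue() for _ in range(4)]
print(served)
```

The real-time cell is served first even though it arrived last, and the two level-1 VCs alternate, so a burst on one VC cannot starve the other.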

6.

Performance Measures for Evaluating Algorithms for SIMD Machines

《IEEE transactions on pattern analysis and machine intelligence》1982,(4):319-331

This paper examines measures for evaluating the performance of algorithms for single instruction stream–multiple data stream (SIMD) machines. The SIMD mode of parallelism involves using a large number of processors synchronized together. All processors execute the same instruction at the same time; however, each processor operates on a different data item. The complexity of parallel algorithms is, in general, a function of the machine size (number of processors), problem size, and type of interconnection network used to provide communications among the processors. Measures which quantify the effect of changing the machine-size/problem-size/network-type relationships are therefore needed. A number of such measures are presented and are applied to an example SIMD algorithm from the image processing problem domain. The measures discussed and compared include execution time, speed, parallel efficiency, overhead ratio, processor utilization, redundancy, cost effectiveness, speed-up of the parallel algorithm over the corresponding serial algorithm, and an additive measure called "sprice" which assigns a weighted value to computations and processors.
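
Two of the listed measures have widely used textbook forms, sketched below. The paper's own definitions may differ in detail, and the timing values are made up for illustration; the other measures (overhead ratio, redundancy, "sprice", and so on) are not reproduced because the abstract does not define them.

```python
def speedup(t_serial, t_parallel):
    """Speed-up of the parallel algorithm over the serial one (textbook form)."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, num_processors):
    """Parallel efficiency: speed-up per processor (textbook form)."""
    return speedup(t_serial, t_parallel) / num_processors


# Made-up timings: 64 time units serially, 10 units on 8 processors.
s = speedup(64.0, 10.0)
e = efficiency(64.0, 10.0, 8)
print(s, e)
```

Holding the problem size fixed and varying `num_processors` is one simple way to see how such measures respond to the machine-size/problem-size relationship the abstract mentions.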

7.

Achieving fair service with a hybrid scheduling scheme for CICQ switches

HU HongChao GUO YunFei YI Peng & LAN JuLong 《中国科学:信息科学(英文版)》2012,(3):689-700

Providing performance guarantees for arriving traffic flows has become an important measure for today’s routing and switching systems. However, none of the current scheduling algorithms built on CICQ (combined input and cross-point buffered) switches can provide flow level performance guarantees. Aiming at meeting this requirement, the feasibility of implementing flow level scheduling is discussed thoroughly. Then, based on the discussion, the paper proposes a hybrid and stratified fair scheduling (HSFS) scheme, which is hierarchical and hybrid, for CICQ switches. With HSFS, each input port and output port can schedule variable length packets independently with a complexity of O(1). Theoretical analysis shows that HSFS can provide delay bound, service rate and fair performance guarantees without speedup. Finally, we implement HSFS in SPES (switch performance evaluation system) to verify the analytical results.

8.

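A hierarchical hybrid scheduling policy for CICQ supporting fair service

扈红超 郭云飞 伊鹏 兰巨龙 《中国科学:信息科学》2012,(4):410-422

Providing performance guarantees for arriving traffic is an important criterion for judging the performance of a switching system. Because existing scheduling policies for the combined input and crosspoint queued (CICQ) switch fabric lack flow-based quality-of-service guarantees, this paper examines the feasibility of flow-based scheduling on CICQ switches and proposes a hierarchical hybrid scheduling policy (HSFS) that provides fair service for arriving traffic flows. HSFS uses a hierarchical hybrid scheduling mechanism in which each input and output port can switch variable-length packets independently with O(1) complexity, which gives it good scalability. Theoretical analysis shows that HSFS can provide delay-bound, rate, and fairness guarantees for arriving traffic without speedup. Finally, the performance of HSFS is evaluated on SPES.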

9.

A performance study of uplink scheduling algorithms in point-to-multipoint WiMAX networks

Najah Abu Ali Pratik Dhrona Hossam Hassanein 《Computer Communications》2009,32(3):511-521

The IEEE 802.16 standard defines the specifications for medium access control (MAC) and physical (PHY) layers of WiMAX networks. A critical part of the MAC layer specification is packet scheduling, which resolves contention for bandwidth and determines the transmission order of users. Evaluating the performance of packet scheduling algorithms is of utmost importance towards realizing large-scale WiMAX deployment. In this paper, we conduct a comprehensive performance study of scheduling algorithms in point-to-multipoint mode of OFDM-based WiMAX networks. We first make a classification of WiMAX scheduling algorithms, then simulate a representative number of algorithms in each class taking into account the vital characteristics of the IEEE 802.16 MAC layer and OFDM physical layer. We evaluate the algorithms with respect to their abilities to support multiple classes of service, providing quality of service (QoS) guarantees, fairness amongst service classes and bandwidth utilization. To the best of our knowledge, no such comprehensive performance study has been reported in the literature. Simulation results indicate that none of the current algorithms is capable of effectively supporting all WiMAX classes of service. We demonstrate that an efficient, fair and robust scheduler for WiMAX is still an open research area. We conclude our study by making recommendations that can be used by WiMAX protocol designers.

10.

Speeding up high-speed protocol processors

Serpanos D.N. 《Computer》2004,37(9):108-111

Many network technologies aim to exploit the bandwidth of high-speed links, which now achieve data transfer rates up to several terabits per second. As packet interarrival times shrink to a few tens of nanoseconds, network systems must address a transmission-processing gap by providing extremely fast data paths as well as high-performance subsystems to implement such functions as protocol processing, memory management, and scheduling. Today, network processors are an important class of embedded processors, used all across the network systems space, from personal to local and wide area networks. Network processor architectures focus on exploiting parallelism to achieve high performance. They usually employ conventional architectural concepts to accelerate the processing required to switch packets between different protocol stacks. The architectures support the mechanisms that network protocols implement in a specific stack by providing efficient data paths and by executing many intelligent network or more homogeneous links - for example, a set of Ethernet links. Although network processors can also handle packets concurrently from different protocol stacks, we describe only single-stack processing here. However, the arguments and results extend to a multistack environment.

11.

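Real-time scheduling of cooperative tasks in sensor networks

胡侃 刘云生 《计算机科学》2007,34(10):65-69

In real-time monitoring applications of sensor networks, large numbers of sensors are scattered across the monitored area to sense information about the environment or the monitored objects, and a group of sensors with limited capabilities often cooperates to complete one large real-time sensing task. Cooperation is an important property of sensor networks, and it requires resource sharing among real-time tasks. A purely real-time system usually relies on resource isolation to guarantee task timeliness, and therefore cannot handle well the cooperative processing of collected stream data in a sensor network environment. Based on a server-based scheduling framework, this paper uses the temporal attributes of the real-time environment, combining data time with program time so as to link application semantics with the programs running in the system. It proposes a real-time scheduling scheme based on temporal dependencies and gives an event-driven concurrent data flow graph model based on this scheme, together with its implementation mechanism. Analysis shows that the model effectively solves the cooperation problem in processing stream data collected in the monitored area, reduces data loss, and improves the real-time responsiveness of the system.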

12.

Providing QOS guarantees for disk I/O (total citations: 1; self-citations: 0; citations by others: 1)

Ravi Wijayaratne A.L. Narasimha Reddy 《Multimedia Systems》2000,8(1):57-68

In this paper, we address the problem of providing different levels of performance guarantees or quality of service for disk I/O. We classify disk requests into three categories based on the provided level of service. We propose an integrated scheme that provides different levels of performance guarantees in a single system. We propose and evaluate a mechanism for providing deterministic service for variable-bit-rate streams at the disk. We will show that, through proper admission control and bandwidth allocation, requests in different categories can be ensured of performance guarantees without getting impacted by requests in other categories. We evaluate the impact of scheduling policy decisions on the provided service. We also quantify the improvements in stream throughput possible by using statistical guarantees instead of deterministic guarantees in the context of the proposed approach.

13.

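A fast task selection algorithm for dynamic preemption threshold scheduling

贺小川 贾焰 《计算机工程与科学》2008,30(12):51-54

Real-time scheduling algorithms based on dynamic preemption thresholds combine the characteristics of non-preemptive and fully preemptive scheduling: they reduce the CPU time wasted by excessive arbitrary preemption while maintaining high CPU utilization. However, the runtime overhead of existing task selection algorithms seriously degrades overall system performance. To address this problem, this paper proposes a fast task selection algorithm that uses a "selection tree" as the task queue structure and has time complexity O(|log2n|). The paper proves the algorithm's correctness theoretically and validates its effectiveness in embedded real-time systems on a Nokia smartphone with an ARM9 chip. Experiments show that the algorithm makes full use of the processor while effectively reducing the overhead of dynamic threshold scheduling.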
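
A selection tree used as a task queue can be sketched as a generic tournament tree, in which changing one task's state replays only the matches on the path to the root (logarithmic in the number of slots). This is an assumption-laden illustration rather than the paper's algorithm: the abstract does not describe how preemption thresholds enter the comparison, so the sketch compares plain priorities, and the class and method names are hypothetical.

```python
class SelectionTree:
    """Tournament tree over n task slots; the root holds the best ready task."""

    def __init__(self, n):
        self.size = 1
        while self.size < n:
            self.size *= 2
        # Leaves live at indices size .. 2*size-1; None marks an empty slot.
        self.tree = [None] * (2 * self.size)

    @staticmethod
    def _better(a, b):
        if a is None:
            return b
        if b is None:
            return a
        return a if a[0] <= b[0] else b  # smaller number = higher priority

    def update(self, slot, priority):
        """Set a slot's priority (None = not ready) and replay matches up to the root."""
        i = self.size + slot
        self.tree[i] = None if priority is None else (priority, slot)
        i //= 2
        while i >= 1:
            self.tree[i] = self._better(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2

    def select(self):
        """Return the slot of the highest-priority ready task, or None if none is ready."""
        winner = self.tree[1]
        return None if winner is None else winner[1]


tree = SelectionTree(5)
tree.update(0, 7)
tree.update(3, 2)
tree.update(4, 5)
first = tree.select()     # slot 3 holds the smallest priority number
tree.update(3, None)      # task in slot 3 blocks or finishes
second = tree.select()
print(first, second)
```

Selection itself is a constant-time read of the root; the logarithmic cost is paid on each update.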

14.

A new design for end-to-end proportional loss differentiation in IP networks

Pablo J. Argibay-Losada Andrés Suárez-González Cándido López-García Manuel Fernández-Veiga 《Computer Networks》2010,54(9):1389-1403

This paper describes the algorithms and the architecture of a network able to provide end-to-end proportional packet loss probabilities at the flow level. We show that the combination of a simple classification technique at the sources, and a network core having two internal service classes, is sufficient to achieve proportional service without the need to deploy coordinated, complex per-hop scheduling schemes or signaling protocols, which is the conventional approach. The proposed architecture is complementary to any differentiation algorithm used by the routers. Our results show that any network endowed with some internal service classes with respect to packet loss probabilities can be exploited to build a set of external service classes with end-to-end and per-flow guarantees.

15.

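A survey of next-generation network processors and their applications

赵玉宇 程光 刘旭辉 袁帅 唐路 《软件学报》2021,32(2):445-474

As the core computing chip of network devices, capable of handling mainstream tasks such as route lookup, high-speed packet processing, and QoS guarantees, the network processor can use its programmability to meet diverse packet processing requirements and adapt to different network application scenarios. Given the changes in the network environment brought by ultra-high bandwidth and intelligent terminals, the design of high-performance, evolvable next-generation network processors is a hot topic in network communications and has attracted wide attention from researchers. Combining the advantages of different chip architectures, high-speed service of specific ...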

16.

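A fuzzy-clustering-based scheduling algorithm for DAG task graphs in grids (total citations: 19; self-citations: 2; citations by others: 19)

杜晓丽 蒋昌俊 徐国荣 丁志军 《软件学报》2006,17(11):2277-2288

Because the target system for task scheduling in a grid environment is large, distributed, heterogeneous, and dynamic, this paper proposes a grid heterogeneous task scheduling algorithm based on fuzzy clustering. Many earlier scheduling algorithms traverse the entire target system at every scheduling step; although they can achieve a small makespan, they inevitably increase the runtime of the scheduling itself. The paper defines a set of features that characterize the overall performance of a processing unit and uses fuzzy clustering to preprocess the target system (the network of processing units), producing a reasonable partition of that network. During task scheduling, the cluster of processing units with better overall performance can then be selected first with reasonable accuracy, which shrinks the search space and greatly reduces the time spent choosing processing units. In addition, the construction of ready-task priorities implicitly accounts for how the execution of nodes on the critical path affects the whole program, and also for how heterogeneous resources affect task execution. Experiments and comparative performance analysis show that the defined processor features yield a reasonable partition of the processor network, and that the advantage of the proposed algorithm becomes more pronounced as the target system grows.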

17.

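Customizing RISC-V compressed instructions for network packet forwarding

吕倩茹 王彦鹏 曹壮 文梅 《计算机工程与科学》2018,40(3):381-387

Instruction stream issue and instruction cache misses are two major causes of processor energy dissipation. Programs produced for loosely coupled RISC instruction sets aggravate this energy consumption, and in network devices with limited on-chip cache, such as routers and switches, the performance loss and power increase caused by the instruction stream are even more severe. Targeting network packet forwarding, an important network function service, this paper analyzes the instruction characteristics of packet forwarding and, based on the RISC-V instruction set architecture, re-customizes the RV32C compressed instruction extension. Tests on the Spike simulator show that after optimization the compression ratio is reduced to 70% and the dynamic instruction compression ratio is 90%; under the same cache conditions, the instruction cache miss rate with the customized compressed instructions is 30%~70% lower than with standard RISC-V.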

18.

FEADS: A Framework for Exploring the Application Design Space on Network Processors

Rajani Pai R. Govindarajan 《International journal of parallel programming》2007,35(1):1-31

Network processors are designed to handle the inherently parallel nature of network processing applications. However, partitioning and scheduling of application tasks and data allocation to reduce memory contention remain as major challenges in realizing the full performance potential of a given network processor. The large variety of processor architectures in use and the increasing complexity of network applications further aggravate the problem. This work proposes a novel framework, called FEADS, for automating the task of application partitioning and scheduling for network processors. FEADS uses the simulated annealing approach to perform design space exploration of application mapping onto processor resources. Further, it uses cyclic and r-periodic scheduling to achieve higher throughput schedules. To evaluate dynamic performance metrics such as throughput and resource utilization under realistic workloads, FEADS automatically generates a Petri net (PN) which models the application, architectural resources, mapping and the constructed schedule and their interaction. The throughput obtained by schedules constructed by FEADS is comparable to that obtained by manual scheduling for linear task flow graphs; for more complicated task graphs, FEADS’ schedules have a throughput which is up to 2.5 times higher compared to the manual schedules. Further, static scheduling of tasks results in an increase in throughput by up to 30% compared to an implementation of the same mapping without task scheduling.

19.

Increasing the Instruction Fetch Rate via Block-Structured Instruction Set Architectures

Eric Hao Po-Yung Chang Marius Evers Yale N. Patt 《International journal of parallel programming》1998,26(4):449-478

To exploit larger amounts of instruction level parallelism, processors are being built with wider issue widths and larger numbers of functional units. Instruction fetch rate must also be increased in order to effectively exploit the performance potential of such processors. Block-structured ISAs provide an effective means of increasing the instruction fetch rate. We define an optimization, called block enlargement, that can be applied to a block-structured ISA to increase the instruction fetch rate of a processor that implements that ISA. We have constructed a compiler that generates block-structured ISA code, and a simulator that models the execution of that code on a block-structured ISA processor. We show that for the SPECint95 benchmarks, the block-structured ISA improves the performance of an aggressive wide issue, dynamically scheduled processor by 15% while using simpler microarchitectural mechanisms to support wide issue and dynamic scheduling.

20.

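Cooperative global instruction scheduling and register allocation (total citations: 1; self-citations: 1; citations by others: 0)

吴承勇 连瑞琦 张兆庆 乔如良 《计算机学报》2000,23(5):493-499

Instruction-level parallelism is an important feature of modern high-performance processors, and the compiler has a crucial influence on how well the parallel processing capability of such processors is exploited. This paper discusses the core problem in compiling for instruction-level parallelism: global instruction scheduling and register allocation. Against the background of the authors' compiler system for a new explicitly parallel architecture microprocessor, it describes the ordering problem between instruction scheduling and register allocation that arises in designing the back end of such a compiler, and a cooperative method for global instruction scheduling and register allocation proposed to solve it.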