1.

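Ji Yuanyuan, Zhang Tao, Wang Haipeng 《Computer Engineering (计算机工程)》2013,39(2):279-282

To address thread scheduling in operating systems on multi-core platforms, this paper proposes a scheduling strategy based on thread pipelines. Targeting chip multithreading processors and borrowing the parallelism advantages of pipelining, it introduces the concept of a thread pipeline. By defining thread characteristic metrics and computing the aggregation degree of each thread pipeline and the matching degree of the corresponding threads, the strategy completes thread scheduling, and it is then further optimized for embedded use. Experiments simulating a real environment show that the strategy takes less time than a scheduling strategy based on static priorities.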

2.

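GCC's Pipeline Hazard Recognizer and Parallel Scheduler

Ye Wei, Ma Jie, Hou Chaohuan 《Computer Engineering and Applications (计算机工程与应用)》2005,41(20):10-11,18

Very long instruction word (VLIW) processors usually have multi-stage pipelines and complex resource-usage constraints, so accurately describing the processor's pipeline model and quickly determining whether a resource conflict exists is not a simple task. This article introduces the pipeline description mechanism, based on a regular-expression syntax, that was newly introduced in GCC. While porting GCC to the SuperV chip developed by the authors, the mechanism was used to describe the chip's pipeline structure and resource-usage constraints in detail, which enabled GCC's instruction-level parallel scheduling. With parallel scheduling, the performance of the test programs improved by roughly 6% to 35%.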

3.

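A Parallel Scheduling Strategy for Task-Graph Models on Multi-core Platforms

Zhou Benhai, Qiao Jianzhong, Lin Shukuan 《Journal of Chinese Computer Systems (小型微型计算机系统)》2012,33(11)

Thanks to their high performance and low power consumption, multi-core processors now dominate the mainstream market. This paper proposes a scheduling strategy for multi-core platforms based on a task-graph model. It builds spatial and temporal parallel scheduling models for task graphs on multi-core platforms. For these models it proposes an optimization algorithm that merges and allocates parallel nodes, and an optimization algorithm for pipeline parallelism. Finally, it proposes a parallel scheduling strategy that combines the optimized spatial and temporal scheduling techniques. Experiments show that, compared with other multi-core parallel scheduling algorithms, the proposed algorithm reduces communication and synchronization overhead between processor cores and improves the system's computational efficiency and throughput.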

4.

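Application of a Scheduling System Based on the Highest Response Ratio Method and the Hungarian Algorithm to Assembly-Line and Warehousing Systems

Ding Xichen, Wu Zhuyun, Wang Yiting, Wang Jie 《Industrial Control Computer (工业控制计算机)》2012,25(12):106-107,110

To connect otherwise isolated assembly-line and warehousing systems, the paper designs an optimized application of a scheduling system based on the highest response ratio method and the Hungarian algorithm, and builds mathematical models of both methods. The authors argue that this scheduling system can improve the efficiency of assembly-line and warehousing systems and further optimize their task scheduling.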
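The abstract names the highest response ratio method without giving the paper's model. As a rough illustration only, the sketch below uses the textbook definition, response ratio = (waiting time + service time) / service time, in a non-preemptive scheduler. The `Job` class, `response_ratio`, and `hrrn_order` are hypothetical names introduced here, not taken from the paper, and the Hungarian-algorithm assignment step is not covered.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    arrival: float  # time at which the job enters the queue
    service: float  # estimated processing time


def response_ratio(job, now):
    """Textbook response ratio: (waiting time + service time) / service time."""
    return ((now - job.arrival) + job.service) / job.service


def hrrn_order(jobs):
    """Return job names in the order a non-preemptive HRRN scheduler runs them."""
    pending = sorted(jobs, key=lambda j: j.arrival)
    now, order = 0.0, []
    while pending:
        ready = [j for j in pending if j.arrival <= now]
        if not ready:
            # Nothing has arrived yet: jump to the next arrival time.
            now = pending[0].arrival
            continue
        # Pick the ready job with the highest response ratio.
        nxt = max(ready, key=lambda j: response_ratio(j, now))
        pending.remove(nxt)
        now += nxt.service
        order.append(nxt.name)
    return order


jobs = [Job("A", 0, 3), Job("B", 2, 6), Job("C", 4, 4), Job("D", 6, 5), Job("E", 8, 2)]
print(hrrn_order(jobs))
```

Because the ratio grows with waiting time, long jobs are not starved indefinitely, while short jobs still tend to run early.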

5.

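Research on a Multimedia Coprocessor Unit Supporting AltiVec Technology

Huang Xiaoping, Fan Xiaoya, Zhang Shengbing 《Application Research of Computers (计算机应用研究)》2008,25(10):3161-3164

Extending an embedded processor with multimedia capabilities strengthens its handling of multimedia data. Starting from the 32-bit Longteng (龙腾) embedded processor, this work studies AltiVec and superscalar techniques and designs a multimedia coprocessor unit supporting AltiVec for that processor. The unit uses a five-stage pipeline and dispatches instructions to different pipelines through dynamic instruction scheduling, which raises processing performance while preserving the design frequency. In multimedia benchmark tests the unit achieves an IPC of 1.2, and with the SMIC 0.18 μm process library it runs at 350 MHz. The coprocessor unit improves the performance of the Longteng processor.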

6.

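Load-Ahead Memory Access Scheduling for a Static Superscalar MCU-DSP Core

Liu Bo, Zhang Shengbing, Huang Songren 《Application Research of Computers (计算机应用研究)》2013,30(2):450-453

For applications that mix embedded control with digital signal processing, the paper establishes a load-ahead mechanism for a processor with a fused MCU-DSP architecture. The core uses static superscalar techniques, has three pipelines (integer, load/store, and loop), and adopts a special four-stage pipeline. In the load/store pipeline, the load-ahead mechanism dynamically schedules the order of memory accesses so that Load instructions can go ahead of Store instructions. This readies operands for the integer pipeline earlier and speeds up pipeline processing.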

7.

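A Generation Environment for Cycle-Accurate ASIP Simulators

Li Xi, Zhong Li, Gao Yanyan 《Computer Simulation (计算机仿真)》2008,25(5):290-293

A cycle-accurate simulator is a key tool in developing an ASIP (application-specific instruction-set processor). The paper presents an environment, driven by the architecture description language mtADL, for rapidly generating cycle-accurate ASIP simulators. mtADL can concisely and precisely describe the two microarchitectures most common in the embedded domain: simple pipelines and Tomasulo dynamically scheduled pipelines. The simulator generator mtGEN automatically produces a cycle-accurate simulator from an mtADL description, and the paper describes the generation algorithm that mtGEN uses. In the experiments, cycle-accurate simulators were generated automatically for three very different processors (a 5-stage pipelined MIPS, a 3-stage pipelined ARM7, and a dynamically scheduled MIPS), which demonstrates the correctness and effectiveness of the method.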

8.

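Design of a 32-bit Multithreaded Packet-Processing Microengine (cited by: 1 total, 0 self-citations, 1 by others)

Zhou Xiping, Gao Deyuan, Fan Xiaoya, Zhang Shengbing 《Journal of Chinese Computer Systems (小型微型计算机系统)》2006,27(11):2072-2076

Hardware multithreading is a core technology in network processors. This paper presents the design of NRS05, a hardware-multithreaded packet-processing microengine dedicated to network protocol processing, and describes the overall structure of its pipeline in detail. It proposes a dynamic scheduling strategy based on hybrid multithreading that hides long-latency operations and ensures that single-thread performance meets application requirements while keeping thread execution on the core fair. By combining multithreading with pipelining, the design resolves the pipeline stalls that control dependences between instructions cause in conventional processors. Finally, the paper gives the synthesis results of the design and its packet-processing performance.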

9.

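A Batch Flow-Shop Scheduling Method Based on a New Distributed Algorithm

《Information and Computer (信息与电脑)》2019,(7)

In an enterprise's production and operations, the production plan is the most important basis for decisions, and the scheduling system is what carries that plan out. Batch flow-shop scheduling is a process of allocating resources sensibly so as to optimize one or more objectives. An optimized batch flow-shop schedule can raise production efficiency and, to some extent, lower production costs. Because current batch flow-shop scheduling schemes have certain problems, the author proposes a batch flow-shop scheduling method based on a new distributed algorithm.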

10.

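Analysis of the Core Technologies of the Modern High-Performance Processors PowerPC 620 and Alpha 21164 (cited by: 2 total, 0 self-citations, 2 by others)

Hu Liangxiao, Chen Yaoqiang 《Journal of Chinese Computer Systems (小型微型计算机系统)》1997,18(6):38-45

The PowerPC 620 and Alpha 21164 are two of today's high-performance processors, and their implementations embody two sharply different design philosophies. This paper therefore analyzes them in detail in terms of architecture, instruction pipelining, instruction scheduling rules, branch handling, and the memory system. The analysis helps in understanding the core technologies of today's high-performance processors and the direction in which instruction-level parallel processing is developing.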

11.

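A Load-Balancing Algorithm for an Aggregated Network Processor Model

Shi Xiangquan, Su Jinshu 《Computer Engineering & Science (计算机工程与科学)》2008,30(12):1-4

This paper proposes a router architecture model whose forwarding subsystem is an aggregated network processor composed of multiple network processors, and designs an algorithm, DIHDA, that distributes load evenly across those network processors. Experiments show that the algorithm maintains load balance while achieving good packet-order preservation, and that its overall performance is better than that of existing comparable algorithms.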
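The abstract does not describe how DIHDA works, so the following is not that algorithm. It is a minimal sketch of a common baseline, static flow hashing, included only to show why load balancing and packet-order preservation pull against each other. The flow tuple layout and the names `pick_processor` and `dispatch` are assumptions made for this example.

```python
import zlib
from collections import defaultdict


def pick_processor(flow, n_processors):
    """Map a flow identifier (e.g. a 5-tuple) to a processor index by hashing."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % n_processors


def dispatch(packets, n_processors):
    """Append each (flow, sequence number) packet to its processor's queue."""
    queues = defaultdict(list)
    for flow, seq in packets:
        queues[pick_processor(flow, n_processors)].append((flow, seq))
    return queues


flow_a = ("10.0.0.1", 1234, "10.0.0.2", 80, "tcp")
flow_b = ("10.0.0.3", 5555, "10.0.0.2", 443, "tcp")
packets = [(flow_a, 0), (flow_b, 0), (flow_a, 1), (flow_b, 1), (flow_a, 2)]
for processor, queue in sorted(dispatch(packets, 4).items()):
    print(processor, [seq for _, seq in queue])
```

All packets of one flow land on the same processor, so their relative order is kept, but a few heavy flows can overload a single processor. Adaptive schemes try to rebalance such flows without reordering their packets.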

12.

Improving IPS by network processors

Pablo Cascón Julio Ortega Yan Luo Eric Murray Antonio Díaz Ignacio Rojas 《The Journal of supercomputing》2011,57(1):99-108

Many present applications usually require high communication throughputs. Multiprocessor nodes and multicore architectures, as well as programmable NICs (Network Interface Cards) provide new opportunities to take advantage of the available multigigabits per second link bandwidths. Nevertheless, to achieve adequate communication performance levels efficient parallel processing of network tasks and interfaces should be considered. In this paper, we leverage network processors as heterogeneous microarchitectures with several cores that implement multithreading and are suited for packet processing, to investigate on the use of parallel processing to accelerate the network interface, and thus the network applications developed above it. More specifically, we have implemented an intrusion prevention system (IPS) with such a network processor. We describe the IPS we have developed that after its offloaded implementation allows faster packet processing of both normal and corrupted traffic. The benefits from placing the IPS close to the network, by using specialized network processors, give many times lower latency and higher bandwidth available to the legitimate traffic.

13.

Comparative packet-forwarding measurement of three popular operating systems

K. Salah M. Hamawi 《Journal of Network and Computer Applications》2009,32(5):1039-1048

This paper measures and compares the network performance (with respect to packet forwarding) of three popular operating systems when used in today's Gigabit Ethernet networks. Specifically, the paper compares the performance in terms of packet forwarding of Linux, Windows Server and Windows XP. We measure both kernel- and user-level packet forwarding when subjecting hosts to different traffic load conditions. The performance is compared and analyzed in terms of throughput, packet loss, delay, and CPU availability. Our evaluation methodology is based on packet-forwarding measurement which is a standard and popular benchmark and evaluation methodology to assess the performance of network elements such as servers, gateways, routers, and switches. Our evaluation methodology considers different configuration setups and utilizes open-source software tools to generate relatively high traffic rates. We consider today's typical network hosts of modern processors and Gigabit network cards. Our measurements show that in general Linux exhibits superior overall performance in the case of kernel (or IP) packet forwarding, whereas Windows Server exhibits superior performance in the case of user-level packet forwarding.

14.

Performance models for network processor design

Wolf T. Franklin M.A. 《Parallel and Distributed Systems, IEEE Transactions on》2006,17(6):548-561

To provide a variety of new and advanced communications services, computer networks are required to perform increasingly complex packet processing. This processing typically takes place on network routers and their associated components. An increasingly central component in router design is a chip-multiprocessor (CMP) referred to as "network processor" or NP. In addition to multiple processors, NPs have multiple forms of on-chip memory, various network and off-chip memory interfaces, and other specialized logic components such as CAMs (content addressable memories). The design space for NPs (e.g., number of processors, caches, cache sizes, etc.) is large due to the diverse workload, application requirements, and system characteristics. System design constraints relate to the maximum chip area and the power consumption that are permissible while achieving defined line rates and executing required packet functions. In this paper, an analytic performance model that captures the processing performance, chip area, and power consumption for a prototypical NP is developed and used to provide quantitative insights into system design trade offs. The model, parameterized with a networking application benchmark, provides the basis for the design of a scalable, high-performance network processor and presents insights into how best to configure the numerous design elements associated with NPs.

15.

A highly flexible, distributed multiprocessor architecture for network processing

《Computer Networks》2003,41(5):563-586

Network processors (NPs) are an emerging field of programmable processors that are optimized to implement data plane packet processing networking functions. Unlike the general-purpose CPUs that rely heavily on caching for improving performance, the lack of locality in packet processing and need for high-performance I/O have forced designers to come up with innovative architectures that can hide memory latency while still processing packets at high data rates. Most of these NPs use some type of multiprocessing in combination with a hierarchy of memory types to achieve high performance. In addition, to keep up with packets arriving at high data rates over multiple incoming media interfaces, an NP must perform fast I/O and memory operations such as packet storage, table lookup, and extraction of fields in packet headers. We describe an architecture that uses a combination of distributed memory architecture and one or more multithreaded processors to achieve the necessary performance. We describe the challenges in programming such a processor including the issues related to consistency and maintaining packet ordering. We also present a programming model for generic network applications that uses software pipelines. We then demonstrate the use of the programming model in implementing two applications, namely, mapping traffic management algorithms onto a multithreaded architecture and an implementation of a media gateway based on voice-over-AAL2.

16.

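Design and Implementation of the Processing Unit in a Network Processor

Li Cheng, Li Huawei 《Computer Engineering (计算机工程)》2007,33(2):252-254

With the rapid growth of network bandwidth and the continual emergence of new network applications, the existing Internet architecture based on general-purpose processors and ASICs can no longer meet new demands. Network processors, which combine strong processing power with flexible programmability, are gradually coming into wide use. High-performance network processors typically use multiple concurrent processing units for fast data-plane processing, and these units occupy a central place in the network processor. This paper discusses the factors to consider when designing such processing units, designs a flexible and effective processing-unit architecture, and verifies it with an FPGA prototype, confirming the feasibility of the structure.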

17.

Myrinet: a gigabit-per-second local area network (cited by: 2 total, 0 self-citations, 2 by others)

Boden N.J. Cohen D. Felderman R.E. Kulawik A.E. Seitz C.L. Seizovic J.N. Wen-King Su 《Micro, IEEE》1995,15(1):29-36

The Myrinet local area network employs the same technology used for packet communication and switching within massively parallel processors. In realizing this distributed MPP network, we developed specialized communication channels, cut-through switches, host interfaces, and software. To our knowledge, Myrinet demonstrates the highest performance per unit cost of any current LAN.

18.

A high-performance and scalable multi-core aware software solution for network monitoring

Mahdi Dashtbozorgi Mohammad Abdollahi Azgomi 《The Journal of supercomputing》2012,59(2):720-743

In recent years, the need for high-performance network monitoring tools, which can cope with rapidly increasing network bandwidth, has become vital. A possible solution is to utilize the processing power of multi-core processors that nowadays are available as commercial-off-the-shelf (COTS) hardware. In this paper, we introduce a software solution for wire-speed packet capturing and transmission for TCP/IP networks under Linux operating system, called DashCap. The results of our experimental evaluations show that the proposed solution causes more than two times performance boost for packet capturing in comparison to the existing software solutions under Linux. We have proposed a scalable software architecture for network monitoring tools called DashNMon, which is based on DashCap. Multi-core awareness is a distinguished property of this architecture. Comparing to the existing cluster-based solutions, DashNMon can be used with COTS multi-core processors. In order to evaluate the proposed solutions, we have developed several prototype tools. The results of the experiments carried out using these tools show the scalability and high performance of the network monitoring tools that are based on the proposed architecture. Using the proposed architecture, it is possible to design and implement high-performance multi-threaded network intrusion detection systems (NIDSs) or application-layer firewalls, completely in the user space and with better utilization of the computational resources of multi-processor/multi-core systems.

19.

Optimal communication algorithms for hypercubes

D. P. Bertsekas C. Özveren G. D. Stamoulis P. Tseng J. N. Tsitsiklis 《Journal of Parallel and Distributed Computing》1991,11(4)

We consider the following basic communication problems in a hypercube network of processors: the problem of a single processor sending a different packet to each of the other processors, the problem of simultaneous broadcast of the same packet from every processor to all other processors, and the problem of simultaneous exchange of different packets between every pair of processors. The algorithms proposed for these problems are optimal in terms of execution time and communication resource requirements; that is, they require the minimum possible number of time steps and packet transmissions. In contrast, algorithms in the literature are optimal only within an additive or multiplicative factor.
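To make the second problem (every processor broadcasting to all others) concrete, here is a small simulation of the textbook dimension-exchange scheme on a d-dimensional hypercube. It is not the optimal algorithm from the paper, whose details the abstract does not give. It merges whole packet sets in each step and ignores per-link packet costs. The function name `all_to_all_broadcast` is hypothetical.

```python
def all_to_all_broadcast(d):
    """Simulate dimension-exchange all-to-all broadcast on a d-cube.

    Node i starts with packet i. In step k every node swaps everything it
    currently holds with its neighbor across dimension k (node XOR 2**k).
    Returns the set of packets known to each node after d steps.
    """
    n = 1 << d
    known = [{i} for i in range(n)]
    for k in range(d):
        # Snapshot so that all exchanges within a step happen simultaneously.
        snapshot = [set(s) for s in known]
        for node in range(n):
            neighbor = node ^ (1 << k)
            known[node] |= snapshot[neighbor]
    return known


result = all_to_all_broadcast(3)
print(all(packets == set(range(8)) for packets in result))
```

After step k each node holds the packets of the 2^(k+1) nodes that agree with it on all higher address bits, so d steps are enough for every node to hold all 2^d packets.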

20.

Synchronous dataflow architecture for network processors (cited by: 1 total, 0 self-citations, 1 by others)

Carlstrom J. Boden T. 《Micro, IEEE》2004,24(5):10-18

Network processors are programmable, highly integrated communications circuits optimized to provide processing at high data and packet rates. The packet instruction set computer (PISC) architecture is a synchronous dataflow architecture developed for network processors. It uses a deep pipeline that contains two types of processing elements: PISC processors, which perform programmable data manipulation, and I/O processors, which provide access to shared resources such as look-up table memory, hardware accelerators, or coprocessors.