Similar Documents
19 similar documents found (search time: 126 ms)
1.
Packet-level caching is difficult to realize in traditional network caching systems. The emergence of Information-Centric Networking (ICN) has eased this problem, but packet-level caching still faces serious scalability issues. After analyzing the factors that currently limit packet-level caching, this paper proposes an optimization method for caching grouped packets. The method reduces the amount of high-speed memory required by building the index on group prefixes rather than on individual packet prefixes, and group-level popularity is also used to optimize cache decisions. A set of evaluation metrics is defined, and extensive experiments are conducted to evaluate the scheme's performance. The results show that, compared with previous packet-level caching schemes, the method greatly reduces high-speed memory usage and achieves significant improvements in server load reduction, average hop reduction, and average cache hit rate.
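As a hedged illustration of the indexing idea in this abstract (not the paper's code), the sketch below keys a fast-memory index on a group prefix shared by many packets, so the index shrinks by roughly the group size; names, sizes, and the hash are assumptions.

```c
/* Minimal sketch of a group-prefix cache index, assuming packets in the
 * same content group share a name prefix and are cached as one unit.
 * Structures and the hash are illustrative, not the paper's design. */
#include <stdint.h>
#include <string.h>

#define INDEX_BUCKETS 4096
#define GROUP_SIZE    64            /* packets per group (assumed) */

struct group_entry {
    char     prefix[64];            /* group prefix, not a per-packet name */
    uint32_t popularity;            /* group-level request count */
    uint64_t slow_mem_addr;         /* where the group's packets live */
    struct group_entry *next;
};

static struct group_entry *index_tab[INDEX_BUCKETS];

static uint32_t hash_prefix(const char *p) {
    uint32_t h = 2166136261u;       /* FNV-1a over the prefix bytes */
    while (*p) { h ^= (uint8_t)*p++; h *= 16777619u; }
    return h % INDEX_BUCKETS;
}

/* One index entry covers GROUP_SIZE packets, so the fast-memory index
 * is roughly GROUP_SIZE times smaller than a per-packet index. */
struct group_entry *lookup_group(const char *prefix) {
    struct group_entry *e = index_tab[hash_prefix(prefix)];
    for (; e; e = e->next)
        if (strcmp(e->prefix, prefix) == 0) { e->popularity++; return e; }
    return NULL;                    /* miss: fetch and consider caching */
}
```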

2.
Advances in video technology call for motion estimation algorithms that are faster and better suited to hardware implementation. This paper proposes a butterfly motion estimation algorithm that combines a butterfly search pattern, a fast early-termination technique, and motion vector prediction. The algorithm is 43.26%-80% faster than the diamond search algorithm and yields better image quality. A VLSI architecture for the algorithm is then built using an adder tree and on-chip parallel memories. With two data-mapping methods (Latin-square mapping and 4×4 block mapping), the architecture not only solves the irregular data-access problem of fast search algorithms but also saves bandwidth: at a 27 MHz system clock with a 16-bit data bus, the required external memory bandwidth is only 4.57 Mbit/s. Compared with other hardware implementations, the architecture uses fewer processing elements and smaller buffers while achieving higher speed and greater flexibility.
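The abstract names the butterfly pattern but not its geometry, so the sketch below shows only the cost kernel any such block-matching search evaluates, plus the early-exit idea the abstract calls fast termination; the block size and names are assumptions.

```c
/* Sum of absolute differences (SAD) between the current block and a
 * candidate block at displacement (dx, dy), with row-wise early exit
 * against the best cost so far. This is the kernel shared by diamond,
 * butterfly, and similar search patterns; it is not the paper's RTL. */
#include <stdint.h>
#include <stdlib.h>

#define BLK 16                        /* macroblock size (assumed) */

uint32_t sad_16x16(const uint8_t *cur, const uint8_t *ref,
                   int stride, int dx, int dy, uint32_t best_so_far) {
    uint32_t sad = 0;
    for (int y = 0; y < BLK; y++) {
        for (int x = 0; x < BLK; x++)
            sad += (uint32_t)abs((int)cur[y * stride + x] -
                                 (int)ref[(y + dy) * stride + (x + dx)]);
        if (sad >= best_so_far)
            return sad;               /* early termination: cannot win */
    }
    return sad;
}
```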

3.
To bind user data and state to user identity in a 3G network monitoring system, this paper proposes an FPGA-based method for high-speed packet reassembly. Based on an analysis of the GTP protocol and the characteristics of FPGAs, a hardware-implementable data storage and lookup algorithm is proposed. The algorithm uses the FPGA's internal BRAM as a cache and external DDR2 as the storage unit, enabling high-speed dynamic storage and lookup of identity information for a large user population. Verified with the ISE synthesis and place-and-route tools on a Xilinx hardware platform, the design reassembles data for one million users at a processing rate of 1.6 Gbps, achieving real-time processing of 3G core network traffic.
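A minimal software sketch of the two-level storage/lookup structure described here, assuming the GTP tunnel identifier (TEID) as the key; a small array models the on-chip BRAM and a large one the off-chip DDR2. Sizes and field names are illustrative, not the paper's design.

```c
/* Two-level user-context store: small fast cache in front of a large
 * slow table, with promotion on a lower-level hit. */
#include <stdint.h>

#define BRAM_SLOTS 1024              /* fits in on-chip memory */
#define DDR_SLOTS  (1 << 20)         /* one million users off-chip */

struct user_ctx { uint32_t teid; uint32_t user_id; uint8_t valid; };

static struct user_ctx bram_cache[BRAM_SLOTS]; /* fast, small */
static struct user_ctx ddr_table[DDR_SLOTS];   /* models DDR2: slow, large */

/* Look up the user bound to a tunnel id, promoting DDR hits into BRAM so
 * packets of an active user are resolved at on-chip speed. */
int lookup_user(uint32_t teid, uint32_t *user_id) {
    struct user_ctx *c = &bram_cache[teid % BRAM_SLOTS];
    if (c->valid && c->teid == teid) { *user_id = c->user_id; return 0; }
    struct user_ctx *d = &ddr_table[teid % DDR_SLOTS];
    if (!d->valid || d->teid != teid) return -1;   /* unknown tunnel */
    *c = *d;                                       /* promote to cache */
    *user_id = d->user_id;
    return 0;
}
```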

4.
Design and implementation of an FPGA-based high-speed sampling buffer system   Cited by: 1 (self-citations: 0, by others: 1)
郑争兵. 《计算机应用》 (Journal of Computer Applications), 2012, 32(11): 3259-3261
To improve the real-time performance of high-speed data acquisition systems, an embedded general-purpose hardware architecture based on an FPGA plus a DSP is proposed. In this architecture, the FPGA implements a novel high-speed sampling buffer that serves as the data path between a high-speed A/D converter and a high-performance DSP, splitting the high-speed data stream and reducing its rate. The sampling buffer uses the soft-core dual-clock FIFO provided by Quartus II 9.0 to form a ping-pong structure; under the control of the DSP's external memory interface (EMIFA), it writes and reads the A/D data stream. Test results show that when the read and write clocks differ widely, the sampling buffer saves time in reading the A/D samples, leaves the DSP ample time for signal processing, and improves the real-time performance of the whole system.
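A minimal sketch of the ping-pong control flow described in the abstract: two FIFO banks alternate so the A/D side fills one while the DSP drains the other. In the actual design these are dual-clock FIFO cores inside the FPGA with proper clock-domain crossing; the C model below only illustrates the handoff, and all names are assumptions.

```c
/* Ping-pong double buffering: the producer and consumer never touch the
 * same bank at the same time, so neither has to wait for the other. */
#include <stdint.h>

#define FIFO_DEPTH 1024

struct fifo {
    uint16_t data[FIFO_DEPTH];
    int count;                       /* samples currently stored */
};

static struct fifo bank[2];
static int write_bank = 0;           /* A/D writes here */

/* Producer side (A/D clock domain): push one sample, swap banks when full. */
void adc_push(uint16_t sample) {
    struct fifo *f = &bank[write_bank];
    f->data[f->count++] = sample;
    if (f->count == FIFO_DEPTH)
        write_bank ^= 1;             /* ping-pong: start filling the other bank */
}

/* Consumer side (DSP/EMIFA clock domain): drain the bank not being filled. */
int dsp_drain(uint16_t *out) {
    struct fifo *f = &bank[write_bank ^ 1];
    int n = f->count;
    for (int i = 0; i < n; i++) out[i] = f->data[i];
    f->count = 0;                    /* bank is empty and ready for reuse */
    return n;
}
```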

5.
刘祯, 刘斌, 郑凯. 《软件学报》 (Journal of Software), 2007, 18(12): 3115-3123
A router must perform route lookup, one of its basic functions, flexibly, at high speed, and at low cost. This paper designs a software-based route lookup cache algorithm for network processors. Part of the network processor's on-chip high-speed memory is set aside, and the instruction code maintains a table that caches route lookup results there. By choosing a hash function that balances collisions among entries against refresh complexity, the algorithm shortens route lookup latency, reduces contention among processing elements for the memory bus, and frees up processing time for other network applications. Experiments driven by real network traffic show that even with only a small number of entries per processing element, the throughput of the network processor is effectively improved.
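A hedged sketch of the kind of on-chip result cache the abstract describes: a small direct-mapped table consulted before the full (off-chip) lookup. The hash and entry layout are assumptions, not the paper's algorithm.

```c
/* Direct-mapped route-result cache kept in on-chip memory. */
#include <stdint.h>

#define CACHE_SLOTS 256              /* small table per processing element */

struct route_entry {
    uint32_t dst_ip;                 /* tag: destination address */
    uint32_t next_hop;               /* cached lookup result */
    uint8_t  valid;
};

static struct route_entry cache[CACHE_SLOTS];

static inline uint32_t hash_ip(uint32_t ip) {
    ip ^= ip >> 16;                  /* cheap mix to spread nearby prefixes */
    return (ip * 0x9E3779B1u) >> 24; /* top 8 bits -> 256 slots */
}

/* Returns the next hop, consulting the full table (off-chip) only on a miss. */
uint32_t route_lookup(uint32_t dst_ip, uint32_t (*full_lookup)(uint32_t)) {
    struct route_entry *e = &cache[hash_ip(dst_ip)];
    if (e->valid && e->dst_ip == dst_ip)
        return e->next_hop;          /* hit: no memory-bus contention */
    uint32_t nh = full_lookup(dst_ip);
    e->dst_ip = dst_ip; e->next_hop = nh; e->valid = 1;  /* refresh slot */
    return nh;
}
```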

6.
As network bandwidth keeps growing, time- and space-efficient packet processing techniques are urgently needed to meet line-rate processing and low storage requirements. Storing all attack signatures in high-speed on-chip memory enables high-speed packet inspection but is constrained by the limited on-chip memory capacity. By constructing a collision-free hash function based on partition bits, the on-chip memory is controlled effectively, with attack signatures distributed evenly across multiple groups at each level of a trie. The structure can run in a pipelined, parallel fashion on a single chip, achieving high throughput. Theory and experiments show that the method completes the complex exact-match operation in a single on-chip memory access and significantly reduces on-chip memory requirements.
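The abstract's hash is built from "partition bits"; the sketch below illustrates that notion under assumptions, by brute-force checking whether a candidate bit mask maps all signatures to distinct buckets. The paper's actual construction is more refined than this.

```c
/* Searching for partition bits: a bit mask is collision-free if no two
 * signatures share a bucket under it. */
#include <stdint.h>
#include <string.h>
#include <stdbool.h>

/* Extract the bits of `key` selected by `mask`, packed toward bit 0. */
static uint32_t extract_bits(uint32_t key, uint32_t mask) {
    uint32_t out = 0, pos = 0;
    for (int b = 0; b < 32; b++)
        if (mask & (1u << b))
            out |= ((key >> b) & 1u) << pos++;
    return out;
}

/* True if `mask` maps every signature to a distinct bucket
 * (assumes buckets <= 65536). */
bool is_collision_free(const uint32_t *sigs, int n, uint32_t mask,
                       int buckets) {
    static bool seen[1 << 16];
    memset(seen, 0, sizeof seen);
    for (int i = 0; i < n; i++) {
        uint32_t b = extract_bits(sigs[i], mask) % (uint32_t)buckets;
        if (seen[b]) return false;   /* two signatures share a bucket */
        seen[b] = true;
    }
    return true;
}
```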

7.
This paper proposes a Clos network with an input-buffered MSM (memory-space-memory) structure, suited to high-speed switching fabrics. A routing algorithm for this structure is presented: it reduces internal contention through orthogonal path splitting, introduces routing priorities to raise internal link utilization, and rotates priorities to balance the internal load. A cell scheduling algorithm matched to this routing algorithm is also given. Simulations show that, although the shared-buffer MSM structure uses a very high internal speedup, the input-buffered MSM structure with orthogonal-splitting routing achieves better delay and throughput, making it better suited to high-speed, high-capacity, many-port routers and switches.

8.
Data caching can effectively reduce network congestion, lighten server load, and speed up information access. Deploying a set of geographically distributed cache nodes that cooperate to serve user requests can further improve system performance. In a distributed caching system, a key concern is optimizing cache placement to minimize access cost. This paper first builds a theoretical model to analyze how replica placement affects the system's access cost. Based on this model, the cache placement problem is formalized as an optimization problem, and a graph algorithm is proposed to solve it. The algorithm uses a modified Dijkstra algorithm to find a shortest path in an access-cost graph; this path corresponds to an optimal cache deployment. The correctness of the algorithm is proved, and its performance is evaluated by simulation. The results show that it outperforms most existing distributed caching mechanisms.
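Since the placement step reduces to a shortest path, a textbook Dijkstra suffices to illustrate it; the sketch below assumes an adjacency-matrix cost graph and does not reproduce the paper's modification or its graph construction.

```c
/* Textbook Dijkstra over an access-cost graph; the nodes on the path
 * recovered through prev[] are the candidate cache locations. */
#include <stdbool.h>

#define MAX_NODES 64
#define INF       0x3fffffff

int dijkstra(int n, const int cost[MAX_NODES][MAX_NODES],
             int src, int dst, int prev[MAX_NODES]) {
    int dist[MAX_NODES];
    bool done[MAX_NODES] = { false };
    for (int i = 0; i < n; i++) { dist[i] = INF; prev[i] = -1; }
    dist[src] = 0;
    for (int iter = 0; iter < n; iter++) {
        int u = -1;
        for (int v = 0; v < n; v++)          /* pick nearest unsettled node */
            if (!done[v] && (u < 0 || dist[v] < dist[u])) u = v;
        if (u < 0 || dist[u] == INF) break;  /* remaining nodes unreachable */
        done[u] = true;
        for (int v = 0; v < n; v++)          /* relax outgoing edges */
            if (cost[u][v] < INF && dist[u] + cost[u][v] < dist[v]) {
                dist[v] = dist[u] + cost[u][v];
                prev[v] = u;                 /* walk prev[] from dst for path */
            }
    }
    return dist[dst];                        /* INF if dst unreachable */
}
```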

9.
To balance and match the speeds of a fast CPU and slow main memory, a memory model supporting parallel high-speed reads is proposed. Its structure, operating principles, data-read algorithm, and timing estimates are analyzed, and its correctness and feasibility are verified by simulation. The model combines the read characteristics of burst-mode and dual-port memories: it can read blocks of consecutive data at high speed, substantially improving the overall performance of a microprocessor system.
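As a hedged illustration of how such parallel reads of consecutive data are usually organized, the sketch below uses standard low-order bank interleaving: consecutive words sit in different banks, so N consecutive words can be fetched in one parallel access cycle. The mapping is an assumption, not the paper's model.

```c
/* Low-order interleave: word i lives in bank (i % N_BANKS) at offset
 * (i / N_BANKS). Any N_BANKS consecutive addresses hit distinct banks,
 * so hardware can read them in parallel; the loop below is sequential
 * only because this is a C model. */
#include <stdint.h>

#define N_BANKS   4
#define BANK_SIZE 1024

static uint32_t bank[N_BANKS][BANK_SIZE];

void burst_read(uint32_t addr, uint32_t out[N_BANKS]) {
    for (int b = 0; b < N_BANKS; b++) {
        uint32_t i = addr + (uint32_t)b;          /* consecutive addresses */
        out[b] = bank[i % N_BANKS][i / N_BANKS];  /* each hits its own bank */
    }
}
```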

10.
嵩天, 李冬妮, 汪东升, 薛一波. 《软件学报》 (Journal of Software), 2013, 24(7): 1650-1665
Multi-pattern matching is a core function of content-inspection network security systems and is widely used in many other fields. In practice, there is a pressing need for fast, stable, large-scale pattern matching, especially for architectures that can process network packets online in real time. This paper presents a storage-efficient, high-speed, large-scale pattern matching algorithm and an associated architecture. Starting from the underlying theory, a cached-state-machine model is proposed; combined with a classification of the state machine's transition rules, a matching algorithm, ACC (Aho-Corasick-CDFA), that dynamically generates cross transition rules is derived. By generating transition rules on the fly, the algorithm shrinks the resulting state machine, making it suitable for large pattern sets. An architecture based on the algorithm is then designed. Experiments with real pattern sets from network security systems show that, compared with other state-machine pattern matching algorithms, the algorithm reduces the state machine size by a further 80%-95%, cuts storage by 40.7%, improves storage efficiency by nearly a factor of two, and reaches a matching speed of 11 Gbps with a single hardware engine.
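A minimal sketch of the mechanism the abstract hints at: rather than storing a dense table of cross transitions, missing transitions are resolved at match time by chasing failure links. The table layout is illustrative, and the sketch is standard Aho-Corasick-style matching, not the ACC algorithm itself.

```c
/* State-machine scan with failure links; only explicitly stored rules
 * occupy memory, and absent rules are resolved dynamically. */
#include <stdint.h>

#define MAX_STATES 1024

struct acsm {
    int32_t next[MAX_STATES][256];   /* -1 where no explicit rule is stored */
    int32_t fail[MAX_STATES];        /* failure link per state */
    uint8_t match[MAX_STATES];       /* nonzero if a pattern ends here */
};

/* Scan `buf`, returning the number of pattern hits. Chasing fail[] in
 * the inner loop is what stands in for stored cross transitions and
 * keeps the machine small. */
int ac_scan(const struct acsm *m, const uint8_t *buf, int len) {
    int state = 0, hits = 0;
    for (int i = 0; i < len; i++) {
        while (state != 0 && m->next[state][buf[i]] < 0)
            state = m->fail[state];  /* dynamic cross-transition resolution */
        int32_t t = m->next[state][buf[i]];
        state = (t < 0) ? 0 : t;
        if (m->match[state]) hits++;
    }
    return hits;
}
```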

11.
Demands on data communication networks continue to drive the need for increasingly faster link speeds. Optical packet switching networks promise to provide data rates that are sufficiently high to satisfy the needs of the future Internet core network. However, a key technological problem with optical packet switching is the very small size of packet buffers that can be implemented in the optical domain. Existing protocols, for example the widely used Transmission Control Protocol (TCP), do not perform well in such small-buffer networks. To address this problem, we have proposed techniques for actively pacing traffic at edge networks to ensure that traffic bursts are reduced or eliminated and thus do not cause packet losses in routers with small buffers. We have also shown that this traffic pacing can improve the performance of conventional networks that use small buffers (e.g., to reduce the cost of buffer memory on routers). A key challenge in this context is to develop systems that can perform such packet pacing efficiently and at high data rates. In this paper, we present the design and prototype of a hardware implementation of our packet pacing technique. We discuss and evaluate design trade-offs and present performance results from a prototype implementation based on a NetFPGA field-programmable gate array system. Our results show that traffic pacing can be implemented with few hardware resources and without reducing system throughput. Therefore, we believe that traffic pacing can be deployed widely to improve the operation of current and future networks.
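A hedged software sketch of a pacing rule of this kind (the paper's implementation is NetFPGA hardware): each departing packet pushes out the earliest release time of the next one in proportion to its size, which caps the instantaneous rate and smooths bursts. Names and the nanosecond timebase are assumptions. For a 1 Gbps pacing rate, ns_per_byte would be 8.

```c
/* Edge pacing: release packets at a fixed byte rate so bursts are
 * smoothed before they reach small-buffer routers. */
#include <stdint.h>
#include <stdbool.h>

struct pacer {
    uint64_t ns_per_byte;            /* inverse of the pacing rate */
    uint64_t next_release;           /* earliest time the next packet may go */
};

/* Returns true if the packet may be sent at time `now_ns`; otherwise the
 * caller holds it until next_release. */
bool pace_ready(struct pacer *p, uint64_t now_ns, uint32_t pkt_len) {
    if (now_ns < p->next_release)
        return false;                /* sending now would re-create a burst */
    /* Space the following packet proportionally to this packet's size. */
    p->next_release = now_ns + (uint64_t)pkt_len * p->ns_per_byte;
    return true;
}
```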

12.
This paper builds a queueing model of a non-blocking packet switching network with input and output buffers, derives results for the mean queue length, mean waiting time, and saturated throughput, and shows that packet-switching throughput can be raised by increasing the switching capacity and relaxing the restriction of the FIFO queueing discipline. Finally, an improved scheme with limited switching capacity and a re-contention mechanism is proposed, and the packet loss probability is analyzed quantitatively, thereby establishing a set of performance metrics for the structure.
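The abstract does not reproduce its formulas. For context only, a classical result in the same setting, which quantifies why relaxing the FIFO discipline helps, is the head-of-line blocking limit of Karol, Hluchyj, and Samadi (1987): a non-blocking N×N switch with pure FIFO input queues under uniform i.i.d. arrivals saturates at

```latex
% Classical saturated throughput with FIFO input queueing, quoted for
% context; the paper's own model and figures are not reproduced here.
\rho_{\max} \;=\; 2 - \sqrt{2} \;\approx\; 0.586 \qquad (N \to \infty)
```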

13.
Packet switching uses packet buffers to store the packets scheduled toward the output ports, and the buffer's read/write speed often determines the performance of a terabit router. To address the low read/write rate of current DRAM, this paper proposes a buffering mechanism based on DDR memory that provides high-speed, high-capacity buffering with QoS support. An engineering solution is implemented that schedules IP packets over twelve 2.5 Gbps channels and guarantees a scheduled output port rate of up to 10 Gbps.

14.
Dias, D.M.; Jump, J.R. Computer, 1981, 14(12): 43-53
Adding buffers to a packet switching network can increase throughput in certain system architectures. A word of warning—don't make them too large.

15.
This paper addresses the design of high-performance buffers for high-end Internet routers. The buffers are typically implemented using a combination of SRAM and DRAM technologies in order to simultaneously meet the routers' high speed and capacity requirements. The major challenge in designing router buffers is to maintain multiple flow queues in the memory, unlike computer memory buffers (i.e., memory system). The major objective is to minimize the use of expensive but fast SRAM while providing acceptable delay guarantees to packets. In this paper, we first investigate hybrid SRAM/DRAM solutions proposed in the past. We show that one of the architectural limitations of these solutions is that the required SRAM size grows linearly with the number of flows in the system. This prevents the solutions from scaling to support a large number of flows. We then break down this shortcoming by proposing a parallel hybrid SRAM/DRAM (PHSD) architecture. We design a series of memory management algorithms (MMAs) for PHSD, based on tradeoffs between the complexity of the MMAs and the guarantee of in-order delivery of packets (segmentations). We perform a detailed analysis of the proposed algorithms and conduct extensive simulations to show that PHSD can significantly outperform solutions proposed in the past in terms of the SRAM requirements and packet delay.
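For orientation, the baseline hybrid organization the paper analyzes (and improves on with PHSD) keeps each flow queue's head and tail in SRAM with the middle in bulk DRAM; the sketch below illustrates that baseline under assumed sizes and field names. It is not the PHSD design itself.

```c
/* Hybrid SRAM/DRAM flow queue: packets enter and leave through small
 * SRAM segments at line rate, while a memory-management algorithm moves
 * whole batches to and from DRAM in wide, efficient transfers. */
#include <stdint.h>

#define HEAD_SLOTS 16                /* per-flow SRAM at the dequeue side */
#define TAIL_SLOTS 16                /* per-flow SRAM at the enqueue side */

struct flow_queue {
    uint32_t head[HEAD_SLOTS];       /* packet handles ready for dequeue */
    uint32_t tail[TAIL_SLOTS];       /* freshly arrived, awaiting transfer */
    int head_n, tail_n;
    uint64_t dram_base;              /* bulk storage for the queue middle */
    int dram_n;
};

/* Enqueue into SRAM; when the tail segment fills, the MMA must drain it
 * to DRAM in one batch before further arrivals are accepted. */
int fq_enqueue(struct flow_queue *q, uint32_t pkt) {
    if (q->tail_n == TAIL_SLOTS) return -1;   /* MMA must drain to DRAM */
    q->tail[q->tail_n++] = pkt;
    return 0;
}
```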

16.
Most of the current communication networks, including the Internet, are packet switched networks. One of the main reasons behind the success of packet switched networks is the possibility of performance gain due to multiplexing of network bandwidth. The multiplexing gain crucially depends on the size of the buffers available at the nodes of the network to store packets at the congested links. However, most of the previous work assumes the availability of infinite buffer-size. In this paper, we study the effect of finite buffer-size on the performance of networks of interacting queues. In particular, we study the throughput of flow-controlled loss-less networks with finite buffers. The main result of this paper is the characterization of a dynamic scheduling policy that achieves the maximal throughput with a minimal finite buffer at the internal nodes of the network under memory-less (e.g., Bernoulli IID) exogenous arrival process. However, this ideal performance policy is rather complex and, hence, difficult to implement. This leads us to the design of a simpler and possibly implementable policy. We obtain a natural trade-off between throughput and buffer-size for such implementable policy. Finally, we apply our results to packet switches with buffered crossbar architecture.

17.
SRAM (static random access memory)-based pipelined algorithmic solutions have become competitive alternatives to TCAMs (ternary content addressable memories) for high-throughput IP lookup. Multiple pipelines can be utilized in parallel to improve the throughput further. However, several challenges must be addressed to make such solutions feasible. First, the memory distribution over different pipelines, as well as across different stages of each pipeline, must be balanced. Second, the traffic among these pipelines should be balanced. Third, the intra-flow packet order (i.e. the sequence) must be preserved. In this paper, we propose a parallel SRAM-based multi-pipeline architecture for IP lookup. A two-level mapping scheme is developed to balance the memory requirement among the pipelines as well as across the stages in each pipeline. To balance the traffic, we propose an early caching scheme to exploit the data locality inherent in the architecture. Our technique uses neither a large reorder buffer nor complex reorder logic. Instead, a flow-aware queuing scheme exploiting the flow information is used to maintain the intra-flow sequence. Extensive simulation using real-life traffic traces shows that the proposed architecture with 8 pipelines can achieve a throughput of up to 10 billion packets per second, i.e. 3.2 Tbps for minimum size (40 bytes) packets, while preserving intra-flow packet order.
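A minimal sketch of the order-preservation idea: if every packet of a flow is steered to the same queue/pipeline, intra-flow order is kept without a reorder buffer. The 5-tuple hash is an assumption; the paper additionally balances load with an early caching scheme, which is not shown.

```c
/* Flow-aware pipeline assignment: same flow, same pipeline, so lookups
 * for a flow complete in arrival order end to end. */
#include <stdint.h>

#define N_PIPELINES 8

struct five_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

static uint32_t flow_hash(const struct five_tuple *f) {
    uint32_t h = f->src_ip ^ f->dst_ip ^ f->proto;  /* simple mix (assumed) */
    h ^= ((uint32_t)f->src_port << 16) | f->dst_port;
    h *= 0x9E3779B1u;
    return h;
}

int pick_pipeline(const struct five_tuple *f) {
    return (int)(flow_hash(f) % N_PIPELINES);
}
```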

18.
Reducing the size of packet buffers in network equipment is a straightforward method for improving the network performance experienced by user applications and also the energy efficiency of system designs. Smaller buffers imply lower queueing delays, with faster delivery of data to receivers and shorter round-trip times for better controlling the size of TCP congestion windows. If small enough, downsized buffers can even fit in the same chips where packets are processed and scheduled, avoiding the energy cost of external memory chips and of the interfaces that drive them. On-chip buffer memories also abate packet access latencies, further contributing to system scalability and bandwidth density. Unfortunately, despite more than two decades of intense research activity on buffer management, current-day system designs still rely on the conventional bandwidth-delay product rule to set the size of their buffers. Instead of decreasing, buffer sizes keep on growing linearly with link capacities.We draw from the limitations of the buffer management schemes that are commonly available in commercial network equipment to define Periodic Early Detection (PED), a new active queue management scheme that achieves important buffer size reductions (more than 95%) while retaining TCP throughput and fairness. We show that PED enables on-chip buffer implementations for link rates up to 100 Gbps while relieving end users from network performance disruptions of common occurrence.
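The abstract does not spell out PED's rule, so the sketch below shows only the general shape of active queue management it builds on: dropping early, on a periodic check, before the buffer fills, so TCP senders back off in time. The periodic trigger and thresholds are assumptions and this is not PED itself.

```c
/* Early-detection drop decision: checked every `period_pkts` enqueues,
 * it drops when occupancy sits above a threshold well below capacity. */
#include <stdint.h>
#include <stdbool.h>

struct aqm {
    uint32_t qlen_bytes;             /* current queue occupancy */
    uint32_t threshold;              /* early-detection point, << capacity */
    uint32_t period_pkts;            /* check every N enqueues ("periodic") */
    uint32_t counter;
};

/* Returns true if this arriving packet should be dropped early so TCP
 * senders back off before the small buffer overflows. */
bool aqm_should_drop(struct aqm *a, uint32_t pkt_len) {
    a->counter++;
    if (a->counter >= a->period_pkts) {
        a->counter = 0;
        if (a->qlen_bytes + pkt_len > a->threshold)
            return true;             /* early drop instead of tail drop */
    }
    return false;
}
```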

19.
This article presents NePSim, an integrated system that includes a cycle-accurate architecture simulator, an automatic formal verification engine, and a parameterizable power estimator for NPs consisting of clusters of multithreaded execution cores, memory controllers, I/O ports, packet buffers, and high-speed buses. To perform concrete simulation and provide reliable performance and power analysis, we defined our system to comply with Intel's IXP1200 processor specification because academia has widely adopted it as a representative model for NP research.
