
1.

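Liang Jing, Chen Zhijian, Meng Jianyi 《计算机应用研究》 (Application Research of Computers) 2012,29(7):2491-2493

To reduce cache access power, this paper proposes a loop-oriented instruction cache way prediction method based on historical access paths. The method enables way prediction only inside loops and trains its predictors with the instruction cache's historical access paths. When a loop body is entered again, the corresponding access-path predictor is selected to obtain the target instruction cache way to access, which lowers access power. A multi-path way prediction method is further proposed to achieve higher prediction accuracy. Experiments on the Powerstone benchmark show that the method reaches 99% prediction accuracy. Compared with a conventional instruction cache, a cache using this method reduces access power by 65% on average while increasing the average instruction cache access cycles by only about 0.2%.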

2.

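An NDN Caching Strategy Based on Content Popularity and Node Betweenness

Guo Chen, Zheng Quan, Ding Yao, Wang Song 《计算机系统应用》 (Computer Systems & Applications) 2017,26(12):165-169

Caching is one of the key technologies of Named Data Networking (NDN). NDN's traditional LCE caching strategy causes considerable redundancy. The improved RCOne strategy places content randomly and uses no content or node information, so it improves network caching performance only to a limited extent. The Betw strategy considers only node betweenness, which causes frequent cache replacement at high-betweenness nodes; when node cache capacity is far smaller than the total amount of content, caching performance degrades. To address these problems, this paper proposes HotBetw (Hot content placed on node with high Betweenness), a new caching strategy that combines content popularity with node betweenness and makes full use of content and node information to choose the best placement for cached content. Simulations show that, compared with typical NDN caching strategies, HotBetw performs well at raising the cache hit ratio and reducing the average hop count.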

3.

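Optimization of D2D Content Caching Placement in Multi-Cache-Capacity Scenarios

Long Yanshan, Wu Dan, Cai Yueming, Wang Meng, Guo Jibin 《计算机应用》 (Journal of Computer Applications) 2018,38(5):1453-1457

In device-to-device (D2D) caching networks, users' limited and heterogeneous caching capability is a key parameter constraining caching efficiency, yet most existing work assumes that all users have the same caching capability. To address this shortcoming, the D2D content caching placement scheme needs to be optimized for users with different cache capacities. First, given the mobility and random distribution of user terminals, stochastic geometry is used to model user nodes with different cache capacities as mutually independent homogeneous Poisson point processes. Second, considering two content offloading modes, local offloading and offloading over nearby D2D links, the network cache hit ratio is derived. Finally, taking maximization of the cache hit ratio as the objective function, a Joint Caching Placement (JCP) algorithm based on coordinate gradient is proposed, yielding a content caching placement scheme for multi-user, multi-cache-capacity scenarios. Simulation results show that, compared with existing caching placement schemes, the scheme obtained by JCP effectively improves the cache hit ratio.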

4.

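A Hierarchical Query Result Caching Mechanism Based on Temporal and Spatial Locality

Zhu Yadong, Guo Jiafeng, Lan Yanyan, Cheng Xueqi 《中文信息学报》 (Journal of Chinese Information Processing) 2016,30(1):63-71

Query result caching can cache either the set of document identifiers of a query's results or the actual returned pages in order to speed up responses to user queries; the two forms are called identifier caching and page caching, respectively. For a fixed amount of memory, an identifier cache achieves a higher hit ratio, whereas a page cache achieves a faster response. Based on the temporal and spatial locality of user query accesses, this paper proposes a novel hierarchical result caching mechanism. First, the mechanism divides a fixed-size result cache into two levels: a page cache and an identifier cache. A submitted query is first answered from the first-level page cache; on a miss, the second-level identifier cache is tried. Experiments show that, compared with traditional mechanisms relying on a single cache form, this hierarchical mechanism achieves a considerable improvement in average query response time: for example, relative to a pure page cache, the improvement is 9% on average and 11% in the best case. Second, on top of the identifier cache, the mechanism adds a heuristic prefetching strategy that exploits the spatial locality of user query retrieval. Experiments show that incorporating this prefetching strategy further improves retrieval system performance, ultimately establishing an effective result caching mechanism that covers both temporal and spatial locality.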
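
The two-level lookup described in this abstract (page cache first, then identifier cache) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes plain LRU replacement at both levels, omits the prefetching step, and the names `TwoLevelResultCache`, `search`, and `render` are hypothetical.

```python
from collections import OrderedDict


class LRU:
    """Tiny LRU map holding at most `capacity` entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


class TwoLevelResultCache:
    def __init__(self, page_slots, id_slots, search, render):
        self.pages = LRU(page_slots)   # level 1: rendered result pages
        self.doc_ids = LRU(id_slots)   # level 2: document-identifier lists
        self.search = search           # hypothetical backend: query -> doc IDs
        self.render = render           # hypothetical: doc IDs -> result page

    def answer(self, query):
        page = self.pages.get(query)
        if page is not None:
            return page, "page-hit"    # fastest path: page already built
        ids = self.doc_ids.get(query)
        if ids is None:
            ids = self.search(query)   # full miss: run the query
            self.doc_ids.put(query, ids)
            source = "miss"
        else:
            source = "id-hit"          # skip the search, only rebuild the page
        page = self.render(ids)
        self.pages.put(query, page)
        return page, source
```

Because identifier lists are much smaller than pages, the second level can hold many more queries in the same memory, which is the trade-off the abstract describes.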

5.

Stack evaluation of arbitrary set-associative multiprocessor caches

Yuguang Wu Muntz R. 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(9):930-942

We propose a simple solution to the problem of efficient stack evaluation of LRU multiprocessor cache memories with arbitrary set-associative mapping. It is an extension of the existing stack evaluation techniques for all set-associative LRU uniprocessor caches. Special marker entries are used in the stack to represent data blocks (or lines) deleted by an invalidation-based cache coherence protocol. A method of marker-splitting is employed when a data block below a marker in the stack is accessed. Using this technique, a single pass over a memory reference trace yields hit ratios for all cache sizes and set-associative mappings of multiprocessor caches. Simulation experiments on some multiprocessor trace data show an order-of-magnitude speed-up in simulation time using this one-pass technique.
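
For background, the basic uniprocessor stack evaluation that this paper extends can be sketched as below. The sketch assumes a fully associative LRU cache and a non-empty trace; it does not model set-associative mappings, invalidations, or the marker entries the abstract introduces. The function name `lru_hit_ratios` is hypothetical.

```python
from collections import Counter


def lru_hit_ratios(trace, max_size):
    """One pass over `trace`; returns {cache_size: hit ratio} for LRU.

    Assumes a fully associative cache and a non-empty trace.
    """
    stack = []             # LRU stack, most recently used block at index 0
    distances = Counter()  # stack distance -> number of references
    for block in trace:
        if block in stack:
            depth = stack.index(block) + 1  # 1-based stack distance
            distances[depth] += 1
            stack.remove(block)
        # first-time references are cold misses and record no distance
        stack.insert(0, block)
    ratios = {}
    hits = 0
    for size in range(1, max_size + 1):
        hits += distances[size]  # a reference hits iff its distance <= size
        ratios[size] = hits / len(trace)
    return ratios
```

A reference at stack distance d hits in every LRU cache holding at least d blocks, so one histogram of distances gives the hit ratio for every cache size at once.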

6.

Reducing Cache Conflicts by Multi-Level Cache Partitioning and Array Elements Mapping

Chang Chih-Yung Sheu Jang-Ping Chen Hsi-Chiuen 《The Journal of supercomputing》2002,22(2):197-219

This article presents an algorithm to reduce cache conflicts and improve cache locality. The proposed algorithm analyzes the locality reference space for each reference pattern, partitions the multi-level cache into several parts of different sizes, and then maps array data onto the scheduled cache positions to eliminate cache conflicts. A greedy method for rearranging array variables in declaration statements is also developed to reduce the memory overhead of mapping arrays onto a partitioned cache. In addition, loop tiling and the proposed schemes are combined to exploit opportunities for both temporal and spatial reuse. Atom is used as a tool to develop a simulation of the behavior of a direct-mapped cache, demonstrating that our approach is effective at reducing the number of cache conflicts and exploiting cache locality. Experimental results reveal that applying the cache partitioning scheme can greatly reduce cache conflicts and thus save program execution time in both single-level and multi-level cache hierarchies.

7.

A scalable Web cache sharing scheme

Yong H Shin 《Information Processing Letters》2004,91(5):227-232

A new Web cache sharing scheme is presented. Our scheme reduces the duplicated copies of the same objects in global shared Web caches. It also reduces the message overhead of existing schemes significantly. Trace-driven simulations with actual Web cache logs show that the proposed scheme performs better than the two well-known Web cache sharing schemes, the Internet Cache Protocol and the Cache Array Routing Protocol.

8.

Linked instruction caches for enhancing power efficiency of embedded systems

Chang-Jung Ku Ching-Wen Chen An Hsia Chun-Lin Chen 《Microprocessors and Microsystems》2014

The power consumed by memory systems accounts for 45% of the total power consumed by an embedded system, and the power consumed during a memory access is 10 times higher than during a cache access. Thus, increasing the cache hit rate can effectively reduce the power consumption of the memory system and improve system performance. In this study, we increased the cache hit rate and reduced the cache-access power consumption by developing a new cache architecture known as a single linked cache (SLC) that stores frequently executed instructions. By adding a new link field, SLC combines the low power consumption and low access delay of a direct-mapped cache with a high cache hit rate similar to that of a two-way set-associative cache. In addition, we developed another design known as multiple linked caches (MLC) to further reduce the power consumption during each cache access and avoid unnecessary cache accesses when the requested data is absent from the cache. In MLC, the linked cache is split into several small linked caches that store frequently executed instructions to reduce the power consumption during each access. To avoid unnecessary cache accesses when a requested instruction is not in the linked caches, the addresses of the frequently executed blocks are recorded in the branch target buffer (BTB). By consulting the BTB, a processor can access the memory to obtain the requested instruction directly if the instruction is not in the cache. In the simulation results, our method performed better than selective compression, traditional cache, and filter cache in terms of the cache hit rate, power consumption, and execution time.

9.

Proxy Cache Replacement Algorithms: A History-Based Approach

Vakali Athena 《World Wide Web》2001,4(4):277-297

Accessing and circulation of Web objects has been facilitated by the design and implementation of effective caching schemes. Web caching has been integrated in prototype and commercial Web-based information systems in order to reduce the overall bandwidth and increase the system's fault tolerance. This paper presents an overview of a series of Web cache replacement algorithms based on the idea of preserving a history record for cached Web objects. The number of references to Web objects over a certain time period is a critical parameter for the cache content replacement. The proposed algorithms are simulated and tested under a real workload of Web cache traces provided by a major (Squid) proxy cache server installation. Cache and byte hit rates are given with respect to different cache sizes and a varying number of request workload sets, and it is shown that the proposed cache replacement algorithms improve both cache and byte hit rates.

10.

Two fast and high-associativity cache schemes

Chenxi Zhang Xiaodong Zhang Yong Yan 《Micro, IEEE》1997,17(5):40-49

In the race to improve cache performance, many researchers have proposed schemes that increase a cache's associativity. The associativity of a cache is the number of places in the cache where a block may reside. In a direct-mapped cache, which has an associativity of 1, there is only one location to search for a match for each reference. In a cache with associativity n (an n-way set-associative cache), there are n locations. Increasing associativity reduces the miss rate by decreasing the number of conflict, or interference, references. The column-associative cache and the predictive sequential associative cache seem to have achieved near-optimal performance for an associativity of two. Increasing associativity beyond two, therefore, is one of the most important ways to further improve cache performance. We propose two schemes for implementing associativity greater than two: the sequential multicolumn cache, which is an extension of the column-associative cache, and the parallel multicolumn cache. For an associativity of four, they achieve the low miss rate of a four-way set-associative cache. Our simulation results show that both schemes can effectively reduce the average access time.
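
The effect of associativity on conflict misses can be illustrated with a small simulation. The sketch below models an ordinary LRU set-associative cache, not the multicolumn schemes proposed in the paper; the function name `miss_rate` and its parameters are hypothetical, and a non-empty address list is assumed.

```python
from collections import OrderedDict


def miss_rate(addresses, num_blocks, ways, block_size=1):
    """Miss rate of an LRU set-associative cache; ways=1 is direct-mapped."""
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        block = addr // block_size
        lines = sets[block % num_sets]     # the only set this block may use
        if block in lines:
            lines.move_to_end(block)       # hit: refresh LRU position
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False)  # set full: evict its LRU block
            lines[block] = True
    return misses / len(addresses)


# Two blocks that collide in a 4-block direct-mapped cache
trace = [0, 4, 0, 4, 0, 4]
direct = miss_rate(trace, num_blocks=4, ways=1)   # every access conflicts
two_way = miss_rate(trace, num_blocks=4, ways=2)  # both blocks fit in one set
```

With the same total capacity, the direct-mapped cache misses on every reference in this trace, while the two-way cache misses only on the first touch of each block.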

11.

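A Z-Ordering Storage Mapping Algorithm for Multidimensional Data and Its Cache Scheduling Optimization

Hou Fang, Lu Jiyuan, Huang Chenghui 《计算机工程与科学》 (Computer Engineering & Science) 2016,38(5):877-884

Multidimensional data is accessed in linear form in storage systems, and nodes that are adjacent in a space of two or more dimensions are mapped by different mapping algorithms to non-adjacent positions in one-dimensional space. When adjacent nodes are accessed in the high-dimensional space, their one-dimensional storage positions have different access distances and access latencies. This paper proposes a storage mapping method based on the Z-Ordering space-filling curve together with a metric for its access distance, and compares it with conventional priority-order algorithms; the method clusters data nodes that are adjacent in high-dimensional space more tightly in one-dimensional storage, strengthening locality. Adjusting the amount of cache space used for prefetching exploits this enhanced locality and raises the cache hit ratio. Experimental results show that access to multidimensional data becomes faster and system performance improves.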
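
The Z-Ordering mapping mentioned in this abstract is commonly computed by interleaving coordinate bits. The sketch below shows the two-dimensional case and contrasts it with a row-major layout, which is assumed here as the conventional baseline; the paper's own access-distance metric is not reproduced, and the function names are hypothetical.

```python
def z_index(x, y, bits=16):
    """Interleave the bits of x and y into a Z-order (Morton) index.

    Bits of x land in even positions, bits of y in odd positions.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def row_major_index(x, y, width):
    """Conventional row-first linear index, used here for comparison."""
    return y * width + x


# The four cells of a 2x2 neighborhood
cells = [(0, 0), (1, 0), (0, 1), (1, 1)]
z_positions = [z_index(x, y) for x, y in cells]
row_positions = [row_major_index(x, y, width=1024) for x, y in cells]
```

In this example the 2x2 neighborhood occupies four consecutive Z-order positions, whereas in a 1024-wide row-major layout the two rows are stored 1024 positions apart.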

12.

Dynamic Partitioning of Shared Cache Memory (total citations: 6; self-citations: 0; citations by others: 6)

G. E. Suh L. Rudolph S. Devadas 《The Journal of supercomputing》2004,28(1):7-26

This paper proposes dynamic cache partitioning amongst simultaneously executing processes/threads. We present a general partitioning scheme that can be applied to set-associative caches. Since memory reference characteristics of processes/threads can change over time, our method collects the cache miss characteristics of processes/threads at run-time. Also, the workload is determined at run-time by the operating system scheduler. Our scheme combines the information, and partitions the cache amongst the executing processes/threads. Partition sizes are varied dynamically to reduce the total number of misses. The partitioning scheme has been evaluated using a processor simulator modeling a two-processor CMP system. The results show that the scheme can improve the total IPC significantly over the standard least recently used (LRU) replacement policy. In a certain case, partitioning doubles the total IPC over standard LRU. Our results show that smart cache management and scheduling are essential to achieve high performance with shared cache memory.

13.

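Research on IP Route Caching Techniques

Zhu Guosheng, Yu Shaohua, Xu Ning 《计算机研究与发展》 (Journal of Computer Research and Development) 2012,49(4):710-716

To address the limitations of the address caching and prefix caching techniques currently used for IP route lookup, this paper analyzes the prefix-overlap characteristics of backbone routing tables and proposes a threshold-based IP route caching method. The method combines address caching and prefix caching and requires no prefix expansion. It overcomes the excessive cache space demanded by address caching and the inability of prefix caching to cache internal prefix nodes, and it has advantages in cache space, cache hit ratio, cache fairness, and incremental route updates. Simulations show that for a routing table with more than 260,000 entries, a cache size of 30,000, and a threshold of K=4, more than 97% of nodes can be cached 1:1 while the remaining nodes use address caching; the cache miss ratio is below 0.02, so high-speed line-rate forwarding can be achieved with a small cache.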

14.

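A Flash Cache Management Strategy Exploiting Block-Level Locality

Gong Jianfeng, Li Xi, Chen Xianglan, Zhu Zongwei, Jia Gangyong 《计算机系统应用》 (Computer Systems & Applications) 2013,22(7):177-182

Flash memory is widely used in the storage devices of electronic products, and research on flash memory is receiving growing attention. Based on the principle of access locality and the asymmetric costs of flash reads and writes, this paper proposes LRU-BLL, a cache management algorithm tailored to flash memory that exploits block-level locality. Experiments show that the method effectively raises the cache hit ratio, reduces the number of dirty-page write-backs, and increases the buffer's average eviction length.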

15.

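A Low-Power, High-Performance Separated-Comparison Cache Scheme

Liu Bin, Peng Manman 《计算机应用研究》 (Application Research of Computers) 2007,24(10):267-268,285

This paper proposes a design method based on a separated-comparison cache. Its key technique is a fully associative cache that stores the low four bits of the original tag, together with a separated tag comparator, so as to obtain high performance and low energy consumption at the same time. SPEC95 simulation results show that, relative to a conventional four-way set-associative cache, the separated-comparison cache saves 13% of access time and 45% to 60% of energy consumption.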

16.

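A Cache Pollution Defense Mechanism Based on Diversified Storage in Content-Centric Networking

Zheng Linhao, Tang Hongbo, Ge Guodong 《计算机应用》 (Journal of Computer Applications) 2015,35(6):1688-1692

To address cache pollution attacks in Content-Centric Networking (CCN), a defense mechanism based on diversified storage is proposed. Different service content is cached differently to reduce the network's attack surface. Services are divided into three classes, each with its own caching policy: privacy-sensitive and real-time services are not cached; streaming media is pushed to the network edge for caching with a certain probability; and other file-type content is pushed progressively from upstream nodes toward the edge. Different nodes are configured with different defenses: edge nodes detect attacks from changes in the arrival probability of content requests, while upstream nodes apply filtering rules that exclude content with low request probability from the cache. Simulation results show that, compared with the defense achieved under the traditional CCN caching strategy, the mechanism raises the network's average cache hit ratio by 17.3% and effectively strengthens the network's defense against cache pollution attacks.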

17.

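An Object-Based Limited Prefetching Strategy for Proxy Servers

Ren Xiaobo, Yang Zhongxiu, Song Jiatao 《计算机工程与科学》 (Computer Engineering & Science) 2009,31(3)

As an active caching technique, prefetching can raise the cache hit ratio, but its effectiveness depends largely on the prefetching strategy used. This paper proposes an object-based limited prefetching strategy for proxy servers: by adjusting the size of the prefetch space in the proxy server, it prevents useless pages from occupying too much cache space and improves cache utilization, thereby achieving a higher hit ratio. Experiments show that the hit ratio of the object-based limited prefetching strategy is far higher than that of LRU and also clearly better than that of object-based LRU.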

18.

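Design of a Fault-Tolerant First-Level Cache Structure at Near-Threshold Voltage

Cheng Yu, Liu Wei, Sun Tongxin, Wei Zhigang, Du Wei 《计算机科学》 (Computer Science) 2020,47(4):42-49

With the sharp increase in silicon integration density and clock frequency, power consumption and heat dissipation have become key challenges in architecture design. Near-threshold voltage operation is a technique with broad application prospects for effectively reducing processor energy consumption. At near-threshold voltage, however, many SRAM cells fail, raising the error rate of the first-level cache and posing a severe challenge to its reliability. Many researchers correct cache errors by sacrificing cache capacity or introducing extra latency, but most of these methods suit only environments with low SRAM cell failure rates and perform poorly when failure rates are high. This paper proposes FTFLC (Fault-Tolerant First-Level Cache), a fault-tolerant first-level cache structure for near-threshold voltage built on conventional 6T SRAM, which performs better in high-failure-rate environments. FTFLC adopts a two-level mapping scheme in which a block mapping mechanism and a bit correction mechanism provide mapping protection for faulty sub-blocks and faulty bits in cache lines. An FTFLC initialization algorithm is also proposed that combines the two mapping mechanisms to increase the usable cache capacity. Finally, FTFLC is simulated with the gem5 simulator in a high-failure-rate environment at 650 mV and compared with three existing cache structures: 10T-Cache, Bit-fix, and Correction Prediction. The results show that, relative to the other structures, FTFLC delivers a performance improvement of at least 3.86% while keeping area and energy overheads low, and raises the usable capacity of the L1 cache by 12.5%.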

19.

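A Multi-Level Web Cache Replacement Strategy Based on Spectral Clustering

Liu Lu, Wu Jue, Yang Lei, Yang Fujun 《计算机系统应用》 (Computer Systems & Applications) 2022,31(11):380-386

The cache replacement strategy is central to server cache performance and directly affects the cache hit ratio. Web caching can alleviate network congestion and user access latency and improve server performance. Because traditional cache replacement algorithms often have low hit ratios, this paper proposes a multi-level cache replacement strategy based on spectral clustering. The strategy uses a circular sliding-window mechanism to extract multiple temporal features and access attributes from log files, then applies spectral clustering to the filtered data set to obtain access predictions. The multi-level replacement strategy jointly considers a cached object's local frequency, global frequency, and resource size, so it can better evict low-value resources while retaining high-value ones. In experimental comparisons with the traditional replacement algorithms LRU, LFU, RC, and FIFO, combining spectral clustering with the multi-level cache replacement strategy effectively improves the request hit ratio and byte hit ratio.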

20.

Evolutionary Techniques for Web Caching

Athena Vakali 《Distributed and Parallel Databases》2002,11(1):93-116

Web caching has been proposed as an effective solution to the problems of network traffic and congestion, Web object access, and Web load balancing. This paper presents a model for optimizing Web cache content by applying either a genetic algorithm or an evolutionary programming scheme for Web cache content replacement. Three policies are proposed for each of the genetic algorithm and the evolutionary programming techniques, in relation to object staleness factors and retrieval rates. A simulation model is developed and long-term trace-driven simulation is used to experiment on the proposed techniques. The results indicate that all evolutionary techniques are beneficial to cache replacement, compared to the conventional replacement applied in most Web cache servers. Under an appropriate objective function, the genetic algorithm has been proven to be the best of all approaches with respect to cache hit and byte hit ratios.