期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

赵彩丁永林陈志坚《计算机应用研究》2016,33(2)

针对低电压下,Cache硬错误和软错误概率提高导致Cache不能正常工作的问题,提出了一种基于混合纠错码的Cache结构。该结构利用脏数据正确性必须由处理器中Cache保证而干净数据可由片外恢复的数据特征,将Cache分成多比特纠错码和单比特纠错码保护的两个区域。通过采用新的Cache替换策略,使得脏数据总处于多比特纠错码保护区域,保证其得到较强保护,从而保证Cache在低电压下的可靠性运行。基于EEMBC测试基准的实验结果表明,该设计可以在590mv电压下正常运行,与该领域最新研究 VS-ECC相比,降低了23.6%的纠错码存储信息量,性能提高5.9%。相似文献

2.

ELSS:一种降低数据Cache体转换能量的替换策略

下载免费PDF全文

周宏伟孙岩张民选《计算机工程与科学》2009,31(1)

随着工艺尺寸的缩小以及频率的增加,漏流能量将成为未来微处理器能量消耗的主要来源。其中,片上Cache存储结构将是整个处理器能量消耗的重要组成部分。为了降低漏流能量,组相联数据Cache中采用了分体的结构,通过使用位线隔离技术将那些未被访问的Cache存储体的位线进行隔离,使之进入低能耗状态。本文提出一种新的数据Cache替换策略——ELSS。该策略充分考虑到访问数据Cache的地址具有较好的空间局部性,特别增加了对数据地址序列中的跨步访问模式的识别,用于指导Cache块的替换。通过将符合顺序模式与跨步模式的数据块尽量放在同一个存储体中,可以减少存储体的转换次数。实验表明,使用ELSS替换策略可以进一步减少位线隔离数据Cache使用LRU策略时9%的体转换次数,多节省8%的数据Cache能量消耗,而对性能的影响比使用LRU策略时小。相似文献

3.

代码Cache分组管理策略研究

蒋海涛王铮谭猛《计算机工程与设计》2008,29(17)

代码Cache是动态优化系统的重要组成部分,利用代码Cache可以实现翻译代码的复用,利用软件管理代码Cache存储优化和代码翻译.代码Cache存储大小不等的超级块,超级块之间可能包含指向其它超级块的链接指针,因而会带来较高的替换开销.提出采用分组管理代码Cache的策略,该策略能够有效的平衡Cache管理的复杂性和Cache的失效率. 相似文献

4.

Cache自适应写分配策略 总被引：1，自引：0，他引：1

郇丹丹李祖松胡伟武刘志勇《计算机研究与发展》2007,44(2):348-354

处理器所能提供的有效带宽是目前制约处理器性能提高的关键因素 .通过对Cache写失效行为的分析,提出了一种新的提高处理器带宽利用率的Cache写失效处理策略--Cache自适应写分配策略 .该策略在访存失效队列中收集全修改Cache块,对全修改Cache块采用非写分配策略,并能够自适应地切换为写分配策略 .与传统的Cache写失效处理策略相比,Cache自适应写分配策略硬件代价小,避免了不必要的数据传输,降低Cache污染,减少存储管理队列阻塞的频率 .结果表明,采用Cache自适应写分配策略,STREAM基准测试程序带宽平均提高62.6%,SPEC CPU2000程序的IPC值平均提高5.9% . 相似文献

5.

相变存储器写寿命延长关键技术研究进展

张震付印金胡谷雨《计算机工程与科学》2018,40(9):1546-1555

随着大数据分析应用时效性提升和“存储墙”问题日益突出,存储系统已成为当前计算机系统整体性能的瓶颈。以相变存储器（PCM）为代表的新型非易失性存储器（NVM）具有集成度高、功耗低、读写访问速度高、非易失、体积小和抗震等优良特性,已成为最具潜力的下一代存储设备。然而,写寿命有限是PCM实用化的一道障碍,如何通过减少写操作和磨损均衡以提升PCM使用寿命是当前的研究热点。从减少PCM写操作、均匀写操作分布以及在混合内存中的页面迁移等三个方面介绍了当前PCM写寿命延长技术的研究现状以及优缺点,最后探讨未来进一步改进PCM寿命可能的研究方向。相似文献

6.

基于协作缓存的分布式VOD组播调度

白霜范学峰 《计算机应用研究》2005,22(4):231-233

分布式视频点播中的Cache控制机制是提高系统效率的核心技术,良好的缓存机制可以有效地减少用户的请求丢失率。提出了分布式VOD的一种新型层次化体系结构,采用两层Cache替换机制,将本地服务器机群所有节点内存连成一个全局的虚拟缓存,并给出视频文件基于该缓存的组播调度。相似文献

7.

基于动态Cache策略优化Snort检测引擎性能研究

张雪松《计算机应用与软件》2008,25(3):260-262

提出了一种动态Cache策略,将最近一段时间内经常用到的少量规则结点指针存储在一个Cache块中.当攻击密度上升到一定阈值时,在Snort检测引擎中动态加载Cache块,接下来捕获的每一个数据包都首先和Cache块中存储的指针所指向的规则结点进行匹配.当网络攻击密度降低到一定阈值时,在Snort检测引擎中动态卸载Cache块,避免攻击密度较低时二次匹配带来的额外开销.实验表明,动态Cache策略可以提高Snort检测引擎在高强度攻击下的检测效率,降低漏报率. 相似文献

8.

基于BKDRHash的混合内存损耗均衡算法研究

《微型机与应用》2017,(11)

相变存储器(PCM)是一种新型的非易失性存储器(NVM),与传统内存DRAM互有优势。基于DRAM和PCM的混合内存使得同时发挥DRAM与PCM各自的优势成为可能。然而,由于PCM写操作寿命有限,在设计混合内存的管理策略时,不仅要对混合内存体系结构进行设计,还需要设计一种损耗均衡算法对PCM写操作进行负载均衡优化。文中设计了一种损耗均衡算法,将写操作逻辑地址作为输入,使用BKDRHash函数对地址进行映射,实现PCM的损耗均衡。实验结果表明,文中提出的损耗均衡算法能够以很少的时延与功耗损失大幅提升PCM的使用寿命。相似文献

9.

CISC中混合Cache的优化设计 总被引：1，自引：1，他引：0

江喜平高德远张盛兵王晶《计算机工程与应用》2006,42(10):109-111

论文重点讨论CISC系统中混合Cache的Cache容量、块大小、相联度和替换策略等对Cache系统性能的影响,得到了一种混合Cache的优化方法。基于此方法,设计了“龙腾C1”CISC处理器中Cache单元,综合和流片结果表明该设计符合要求。相似文献

10.

一种新颖的软件可控Cache优化方法

杜红燕田兴彦田新华《计算机工程与应用》2005,41(21):52-57

由于Cache污染问题,传统的仅由硬件控制的Cache替换策略不能得到令人满意的Cache利用率。随着软件可控Cache机制的出现,编译器开始可以直接控制Cache替换,改善Cache行为。本文证明了一个Cache提示优化定理,并依该定理提出了一个由编译器辅助控制的Cache替换策略:最优Cache划分(OCP)。OCPCache替换策略简化了Cache行为和Cache失效分析方法。实验结果表明OCPCache替换策略能有效地降低Cache失效率。相似文献

11.

Reliability improvement in private non-uniform cache architecture using two enhanced structures for coherence protocols and replacement policies

Mohammad Maghsoudloo Hamid R. Zarandi 《Microprocessors and Microsystems》2014

In this paper, a comprehensive study is first conducted to investigate the effects of cache coherence protocols and cache replacement policies on the characteristics of NUCA in current many-core processors. The main focus of this study is to analyze the effects of coherence protocols and replacement policies on the vulnerability of caches. The outcomes of this analysis indicate two facts: (i) Differences in handling write operations play an important role to make distinction in favor of or against a cache coherence protocol; (ii) Near-optimal solutions for replacement problem, aimed at enhancing the performance, can also make positive influence on reduction of cache vulnerability factor. Based on the results of first step, two schemes are introduced to enhance the reliability of caches by applying some modification on the structures of cache coherence protocols and cache replacement policies. The first scheme tries to manage sharing of the dirty data items among different same-level caches. The second helps to give priority and more opportunity to old dirty blocks than clean blocks for replacement. The proposed schemes reveal about 18% improvement in MTTF, with negligible performance, bandwidth and energy consumption overhead compared to previous cache structures. 相似文献

12.

Evaluation of Compiler-Controlled Updating to Reduce Coherence-Miss Penalties in Shared-Memory Multiprocessors

《Journal of Parallel and Distributed Computing》1999,56(2):122-143

We consider in this paper the effectiveness of a new approach calledcompiler-controlledupdating to reduce coherence-miss penalties in shared-memory multiprocessors. A key part of the method is a compiler algorithm that identifies the last store instruction to a memory block in a flow graph using classic dataflow analysis techniques. Such stores are marked and replaced by update instructions that at run time make the memory copy clean. Whereas this static method shortens the read-miss latency for actively shared blocks, it can cause useless traffic for shared blocks that are effectively private. We therefore complement the static analysis with a dynamic simple heuristic in the cache coherence protocol aiming at classifying blocks as private or shared at run time. We evaluate the performance effects of compiler-controlled updating using six scientific parallel applications compiled by an optimizing compiler that incorporates our static analysis and then running them on a detailed CC-NUMA architectural simulation model. We have found that the compiler algorithm can convert between 83 and 100% of the dirty misses into clean misses. By adding the private/shared heuristic, the update traffic of private memory blocks can be practically eliminated. Overall, the static analysis in combination with the dynamic heuristic is shown to reduce the execution time by as much as 32%. 相似文献

13.

一种基于闪存的缓冲区管理算法

尚晓薇林奕《计算机与现代化》2013,(11):74-76,81

在大多数以磁盘为存储系统的操作系统中,缓冲区管理算法只考虑到了数据访问的命中率。然而,闪存的写操作代价远远大于读操作代价。为了提高闪存性能,本文提出的O CFLRU（Optimal CFLRU）算法对于CFLRU（Clean First LRU）算法做了优化。该算法用一种页块混合的数据结构来分别管理缓冲区中的干净页面和脏的数据页面聚簇。当缓冲区空间不够时,优先置换干净页面,再置换出脏的数据页聚簇,从而减少了写回的次数和随机写带来的擦除次数,提高了闪存的性能。相似文献

14.

Probabilistic Prediction of Temporal Locality

Etsion Y. Feitelson D.G. 《Computer Architecture Letters》2007,6(1):17-20

The increasing gap between processor and memory speeds, as well as the introduction of multi-core CPUs, have exacerbated the dependency of CPU performance on the memory subsystem. This trend motivates the search for more efficient caching mechanisms, enabling both faster service of frequently used blocks and decreased power consumption. In this paper we describe a novel, random sampling based predictor that can distinguish transient cache insertions from non-transient ones. We show that this predictor can identify a small set of data cache resident blocks that service most of the memory references, thus serving as a building block for new cache designs and block replacement policies. Although we only discuss the L1 data cache, we have found this predictor to be efficient also when handling L1 instruction caches and shared L2 caches. 相似文献

15.

PAM: an efficient power-aware multilevel cache policy to reduce energy consumption of storage systems

Xiaodong MENG Chentao WU Minyi GUO Long ZHENG Jingyu ZHANG 《Frontiers of Computer Science》2019,13(4):850

Energy consumption is one of the most significant aspects of large-scale storage systems where multilevel caches are widely used. In a typical hierarchical storage structure, upper-level storage serves as a cache for the lower level, forming a distributed multilevel cache system. In the past two decades, several classic LRU-based multilevel cache policies have been proposed to improve the overall I/O performance of storage systems. However, few power-aware multilevel cache policies focus on the storage devices in the bottom level, which consume more than 27% of the energy of the whole system [1]. To address this problem, we propose a novel power-aware multilevel cache (PAM) policy that can reduce the energy consumption of high-performance and I/O bandwidth storage devices. In our PAM policy, an appropriate number of cold dirty blocks in the upper level cache are identified and selected to flush directly to the storage devices, providing high probability extension of the lifetime of disks in standby mode. To demonstrate the effectiveness of our proposed policy, we conduct several simulations with real-world traces. Compared to existing popular cache schemes such as PALRU, PB-LRU, and Demote, PAM reduces power consumption by up to 15% under different I/O workloads, and improves energy efficiency by up to 50.5%. 相似文献

16.

一种运用块级局部性的闪存缓存管理策略

龚剑峰李曦陈香兰朱宗卫贾刚勇《计算机系统应用》2013,22(7):177-182

闪存被广泛应用在电子产品的存储设备中, 针对闪存的研究也日益得到重视. 基于访问的局部性原理, 并结合闪存读写代价的差异性, 提出了一种针对闪存特点运用块级局部性原理的cache缓存管理算法LRU-BLL. 实验表明, 这种方法有效地提高了缓存的命中率, 并且减少了缓存的脏页回写次数和提高了缓冲区的平均换出长度. 相似文献

17.

Analysis of the periodic update write policy for disk cache

Carson S.D. Setia S. 《IEEE transactions on pattern analysis and machine intelligence》1992,18(1):44-54

A disk cache is typically used in file systems to reduce average access time for data storage and retrieval. The `periodic update' write policy, widely used in existing computer systems, is one in which dirty cache blocks are written to a disk on a periodic basis. The average response time for disk read requests when the periodic update write policy is used is determined. Read and write load, cache-hit ratio, and the disk scheduler's ability to reduce service time under load are incorporated in the analysis, leading to design criteria that can be used to decide among competing cache write policies. The main conclusion is that the bulk arrivals generated by the periodic update policy cause a traffic jam effect which results in severely degraded service. Effective use of the disk cache and disk scheduling can alleviate this problem, but only under a narrow range of operating conditions. Based on this conclusion, alternate write packages that retain the periodic update policy's advantages and provide uniformly better service are proposed 相似文献

18.

An Analytic Study of Caching in Computer Systems

Duane Buck Mukesh Singhal 《Journal of Parallel and Distributed Computing》1996,32(2):205

Cache performance is analyzed for the independent reference model (IRM) of data reference in conjunction with the least recently used (LRU) cache block replacement policy. The method has low computational complexity and high accuracy. The method computes the cache-hit rate, the probabilities of various numbers of references to a block while it is cache resident, the probabilities a block being replaced belongs to each memory segment (grouping by probability of access), the probability that a block being replaced was modified while in cache (assuming write-back policy), and the average number of references made to all blocks since the last reference to a block being replaced. The analysis is extended to address cache coherence in parallel and distributed systems. 相似文献

19.

Using eager strategies to improve NFS I/O performance

《International Journal of Parallel, Emergent and Distributed Systems》2013,28(2):134-158

Typical network file system (NFS) clients write lazily: they leave dirty pages in the page cache and defer writing to the server. This reduces network traffic when applications repeatedly modify the same set of pages. However, this approach can lead to memory pressure, when the number of available pages on the client system is so low that the system must work harder to reclaim dirty pages. We show that NFS performance is poor under memory pressure and present two mechanisms to solve it: eager writeback and eager page laundering. These mechanisms change the client's data management policy from lazy to eager, in which dirty pages are written back proactively, resulting in higher throughput for sequential writes. In addition, we show that NFS servers suffer from out-of-order file operations, which further reduce performance. We introduce request ordering, a server mechanism to process operations, as much as possible, in the order they were sent by the client, which improves read performance substantially. We have implemented these techniques in the Linux operating system. I/O performance is improved, with the most pronounced improvement visible for sequential access to large files. We see 33% improvement in the performance of streaming write workloads and more than triple the performance of streaming read workloads. We evaluate several non-sequential workloads and show that these techniques do not degrade performance, and can sometimes improve performance. 相似文献