Similar Documents
20 similar documents found.
1.
We study the use of non-volatile memory for caching in distributed file systems. This provides an advantage over traditional distributed file systems in that the load is reduced at the server without making the data vulnerable to failures. We propose the use of a small non-volatile cache for writes, at the client and the file server, together with a larger volatile read cache to keep the cost of the caches reasonable. We use a synthetic workload developed from analysis of file I/O traces from commercial production systems and a detailed simulation of the distributed environment. The service times for the resources of the system were derived from measurements performed on a typical workstation. We show that non-volatile write caches at the clients and the file server reduce the write response time and the load on the file server dramatically, thus improving the scalability of the system. We examine the comparative benefits of two alternative writeback policies for the non-volatile write cache, and show that the proposed threshold-based writeback policy is more effective than a periodic writeback policy under heavy load. We also investigate the effect of varying the write cache size and show that introducing a small non-volatile cache at the client, in conjunction with a moderate-sized non-volatile server write cache, improves the write response time by a factor of four at all load levels.
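A minimal sketch of such a threshold-based writeback policy, as opposed to a purely periodic one (the class name, threshold value, and flush hook are illustrative assumptions, not the paper's implementation):

```python
# Sketch of a threshold-based writeback policy for a non-volatile
# write cache. All names and the threshold value are illustrative.

class NVWriteCache:
    def __init__(self, capacity_blocks, threshold=0.75):
        self.capacity = capacity_blocks
        self.threshold = threshold   # fraction of cache that may hold dirty blocks
        self.dirty = {}              # block_id -> data

    def write(self, block_id, data, flush_to_server):
        # The write completes as soon as the block is in non-volatile
        # cache, so the client sees a fast, failure-safe response.
        self.dirty[block_id] = data
        if len(self.dirty) >= self.threshold * self.capacity:
            self._writeback(flush_to_server)

    def _writeback(self, flush_to_server):
        # Destage dirty blocks when the threshold is crossed, rather than
        # on a fixed timer; under heavy load this spreads server writes
        # out instead of bunching them at period boundaries.
        for block_id, data in list(self.dirty.items()):
            flush_to_server(block_id, data)
            del self.dirty[block_id]
```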

2.
We describe a data deduplication system for backup storage of PC disk images, named in-RAM metadata utilizing deduplication (IR-MUD). In-RAM hash granularity adaptation and miniLZO-based data compression are first proposed to reduce the in-RAM metadata size and thereby reduce the space overheads required by the in-RAM metadata caches. Second, an in-RAM metadata write cache, as opposed to the traditional metadata read cache, is proposed to further reduce metadata-related disk I/O operations and improve deduplication throughput. During deduplication, the metadata write cache is managed under an LRU caching policy; for each manifest that hits in the metadata write cache, an expensive manifest reload from disk is avoided. After deduplication, all the manifests in the metadata write cache are cleared and stored on disk. Our experimental results using a 1.5 TB real-world disk image dataset show that 1) IR-MUD achieves roughly 95% size reduction for the deduplication metadata, with a small time overhead introduced; 2) when the metadata write cache is not used, with the same RAM budget for the metadata read cache, IR-MUD achieves a 400% higher RAM hit ratio and a 50% higher deduplication throughput than the classic Sparse Indexing deduplication system, which uses no such metadata techniques; and 3) when the metadata write cache is used and enough RAM is available, IR-MUD achieves a 500% higher RAM hit ratio than Sparse Indexing and a 70% higher deduplication throughput than IR-MUD with only a single metadata read cache. The in-RAM metadata harnessing and metadata write caching approaches of IR-MUD can be applied in most parallel deduplication systems to improve metadata caching efficiency.
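A sketch of the LRU-managed manifest write cache described above (the load/store hooks and manifest layout are assumptions):

```python
from collections import OrderedDict

# Sketch of an LRU metadata *write* cache for manifests: a hit avoids an
# expensive manifest reload from disk, and after deduplication every
# cached manifest is persisted. Names and hooks are illustrative.

class ManifestWriteCache:
    def __init__(self, capacity, load_from_disk, store_to_disk):
        self.capacity = capacity
        self.load = load_from_disk
        self.store = store_to_disk
        self.cache = OrderedDict()            # manifest_id -> manifest

    def get(self, manifest_id):
        if manifest_id in self.cache:
            self.cache.move_to_end(manifest_id)   # LRU hit: no disk reload
            return self.cache[manifest_id]
        manifest = self.load(manifest_id)         # miss: the expensive reload
        self.cache[manifest_id] = manifest
        if len(self.cache) > self.capacity:
            victim_id, victim = self.cache.popitem(last=False)
            self.store(victim_id, victim)         # write back the LRU victim
        return manifest

    def flush(self):
        # After deduplication, clear the cache and store all manifests.
        while self.cache:
            manifest_id, manifest = self.cache.popitem(last=False)
            self.store(manifest_id, manifest)
```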

3.
Anna Hać 《Acta Informatica》1993, 30(2):131-146
This paper proposes performance and reliability improvements based on new algorithms for asynchronous operations in a disk buffer cache. The algorithms let processes write files into the buffer cache while taking into account the number of active processes in the system and the length of the queue to the disk buffer cache; writing the contents of the buffer cache to the disk depends on the system load and the write activity. Performance and reliability measures, including the elapsed time of writing a file into the buffer cache, the waiting time to start writing a file, and the mean number of blocks written to the disk between system failures, are used to show the improvement obtained with the algorithms. Sensitivity analysis is used to guide the algorithms' design. Examples of real systems illustrate the numerical performance and reliability results in different systems with various disk cache parameters and file sizes.

4.
Non-volatile memory offers low energy consumption, good scalability, and high storage density, making it a candidate replacement for traditional SRAM as on-chip cache; however, its writes incur high energy and latency costs, so write performance must be optimized before large-scale adoption. This paper proposes a dynamic bypass policy based on cache block reuse information to optimize the cache performance of non-volatile memory. By analyzing the reuse characteristics of benchmark accesses to the last-level cache (LLC), the policy dynamically predicts, from each block's reuse information, whether the corresponding write should bypass the non-volatile cache, and uses a prediction table to perform the bypass when filling on an LLC miss. It also applies dynamic path selection for upper-level cache writebacks: a monitoring module selects a suitable upper-level cache for the bypassed blocks and fills blocks with high reuse counts into it, reducing the number of LLC writes. Experimental results show that, compared with a cache design without the bypass policy, this policy reduces the average running time of all SPLASH-2 programs on a 4-core processor by 6.6% and lowers cache energy consumption by 22.5% on average, effectively improving overall cache performance.
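A simplified sketch of a reuse-based bypass predictor of this kind (table size, counter width, and threshold are illustrative assumptions, not the paper's configuration):

```python
# Simplified sketch of a reuse-based bypass predictor for a non-volatile
# LLC: on a miss fill, a prediction table of saturating reuse counters
# decides whether the incoming block should bypass the NVM cache.

TABLE_SIZE = 4096
REUSE_THRESHOLD = 2      # blocks reused at least this often get cached
COUNTER_MAX = 7          # 3-bit saturating counter

table = [0] * TABLE_SIZE

def _index(block_addr):
    return (block_addr >> 6) % TABLE_SIZE   # assume 64-byte blocks

def on_llc_hit(block_addr):
    i = _index(block_addr)
    table[i] = min(table[i] + 1, COUNTER_MAX)   # block shows reuse

def on_llc_eviction_unused(block_addr):
    i = _index(block_addr)
    table[i] = max(table[i] - 1, 0)             # dead on arrival: decay

def should_bypass(block_addr):
    # Low predicted reuse: route the fill around the NVM LLC (e.g., into
    # an upper-level cache), saving an expensive NVM write.
    return table[_index(block_addr)] < REUSE_THRESHOLD
```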

5.
An adaptive cache write-allocation policy
The effective bandwidth a processor can deliver is currently a key factor limiting processor performance. Based on an analysis of cache write-miss behavior, this paper proposes a new write-miss handling policy that improves processor bandwidth utilization: adaptive cache write allocation. The policy collects fully-modified cache blocks in the memory-access miss queue, applies a no-write-allocate policy to them, and can adaptively switch back to write allocation. Compared with traditional write-miss handling policies, adaptive write allocation has a small hardware cost, avoids unnecessary data transfers, reduces cache pollution, and lowers the frequency of memory management queue stalls. Results show that with adaptive write allocation, the bandwidth of the STREAM benchmarks improves by 62.6% on average, and the IPC of SPEC CPU2000 programs improves by 5.9% on average.
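A sketch of the core mechanism, detecting fully-modified blocks in the miss queue and handling them without the usual fetch (block size and the fetch/write hooks are assumptions):

```python
# Sketch of adaptive write allocation: write misses coalesce in the miss
# queue; if the pending writes cover an entire cache block, the block is
# written without first being fetched (no-write-allocate), otherwise the
# policy falls back to ordinary write allocation.

BLOCK_BYTES = 64

class MissQueueEntry:
    def __init__(self, block_addr):
        self.block_addr = block_addr
        self.written = bytearray(BLOCK_BYTES)   # per-byte dirty mask

    def record_write(self, offset, length):
        for i in range(offset, offset + length):
            self.written[i] = 1

    def fully_modified(self):
        return all(self.written)

def handle_write_miss(entry, fetch_block, write_block):
    if entry.fully_modified():
        # Every byte will be overwritten: skip the memory read entirely,
        # saving bandwidth and avoiding cache pollution.
        write_block(entry.block_addr)
    else:
        # Partial modification: classic write allocation, fetch then merge.
        fetch_block(entry.block_addr)
        write_block(entry.block_addr)
```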

6.
Multiple prefetch adaptive disk caching
A new disk caching algorithm is presented that uses an adaptive prefetching scheme to reduce the average service time for disk references. Unlike schemes that simply prefetch the next sector or group of sectors, this method maintains information about the order of past disk accesses, which it uses to accurately predict future access sequences. The range of parameters of this scheme is explored, and its performance is evaluated through trace-driven simulation, using traces obtained from three different UNIX minicomputers. Unlike disk trace data previously described in the literature, the traces used include time stamps for each reference. With this timing information, which is essential for evaluating any prefetching scheme, it is shown that a cache with the adaptive prefetching mechanism can reduce the average time to service a disk request by a factor of up to three, relative to an identical disk cache without prefetching.
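The abstract does not give the predictor's exact data structure; a first-order successor table is one simple way to "maintain information about the order of past disk accesses", sketched here under that assumption:

```python
from collections import defaultdict, Counter

# One simple realization of order-based prediction: a first-order
# successor table that counts which sector tends to follow which.
# The paper's actual structure may differ; this and the prefetch
# depth are illustrative assumptions.

successors = defaultdict(Counter)   # sector -> Counter of next sectors
last_sector = None

def record_access(sector):
    global last_sector
    if last_sector is not None:
        successors[last_sector][sector] += 1
    last_sector = sector

def predict_prefetch(sector, depth=2):
    # Follow the most frequent successor chain up to `depth` sectors.
    chain = []
    current = sector
    for _ in range(depth):
        if not successors[current]:
            break
        current = successors[current].most_common(1)[0][0]
        chain.append(current)
    return chain
```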

7.
Existing NAND flash memory file systems have not taken multiple NAND flash memories into account for large-capacity storage. In addition, since large-capacity NAND flash memory is much more expensive than a hard disk drive of the same capacity, building large-capacity flash drives from it is not cost-effective. To resolve these problems, this paper suggests a new file system called NAFS for large-capacity storage built from multiple small-capacity, low-cost NAND flash memories. It adopts a new cache policy, mount scheme, and garbage collection scheme in order to improve read and write performance, reduce the mount time, and improve wear-leveling effectiveness. Our performance results show that NAFS is more suitable for large-capacity storage than conventional NAND file systems such as YAFFS2 and JFFS2 and a disk-based Linux file system such as HDD-RAID5-EXT3: its double cache policy improves the read and write transfer rates, and storing metadata on a separate partition shortens the mount time. We also demonstrate that the wear-leveling effectiveness of NAFS can be improved by our adaptive garbage collection scheme.

8.
On-board disk cache is an effective approach to improving disk performance by reducing the number of physical accesses to the magnetic media. Disk drive manufacturers have been increasing the on-board disk cache size to match the capacity growth of the backend magnetic media; some disk drives nowadays have a 32 MB cache. Modern computer systems use large amounts of host memory to improve performance, and any data brought into host memory will be re-accessed there, not in the on-board disk cache. This has a significant impact on the behavior of the disk cache, because a computer system is a complex system of correlated components, and a specific component cannot be isolated from the overall system when analyzing its performance behavior. This paper employs four block-level real traces to explore the performance behavior of the on-board disk cache while accounting for the cache hierarchy of the host system. The analysis yields three major implications: (1) the I/O stream at block level contains negligible temporal locality, so read/write caching can achieve only marginal benefits; (2) a static write cache does not achieve performance gains, since the write stream does not interfere much with the read stream, so it is better to leave the on-board disk cache shared by both streams; and (3) aside from prefetching, the read cache dominates the contribution to the hit ratio, so it is better to focus on improving the read performance of the disk cache rather than its write performance.

9.
Mirrored disks, or RAID1, are a popular disk array paradigm which, in addition to fault tolerance, doubles the data access bandwidth. This is important in view of rapidly increasing disk capacities and the slow improvement in disk access time. Caching dirty data blocks in a non-volatile storage (NVS) cache allows the destaging of dirty blocks to be deferred, so as to improve the response time of read requests by giving them higher priority than write requests. Destaging dirty blocks in batches, to take advantage of disk geometry, lowers the disk utilization due to writes and improves read performance. Polyzois et al. [12] propose a scheduling policy for mirrored disks equipped with an NVS cache in which one disk processes read requests while the other processes a write batch according to the CSCAN policy. We propose an improved scheduling policy: (i) eliminate the forced idleness caused by the batch-processing paradigm for writes, i.e., allow write requests to be processed individually; (ii) use SATF, or even an exhaustive search, to reduce destaging time compared to CSCAN; (iii) introduce a threshold for the number of read requests which, when exceeded, defers the destaging of dirty blocks. We compare these two scheduling policies with each other and also against policies that prioritize reads over writes: (i) the head-of-the-line (HOL) priority queueing discipline, and (ii) SATF with conditional priorities. Simulation results show that the new method outperforms Polyzois' method, which is outperformed even by the HOL priority policy. SATF with conditional priorities slightly outperforms the proposed method in throughput and response time, but is susceptible to more variability in response time.
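A sketch of rule (iii), deferring destage behind a read-queue threshold; the queue representation, threshold value, and the priority order when the read queue is short are assumptions about one plausible reading of the policy:

```python
# Sketch of the proposed mirrored-disk scheduling rule: dirty blocks are
# destaged individually (no forced idleness from batching), but destaging
# is deferred whenever the number of queued reads exceeds a threshold.

READ_THRESHOLD = 4

def next_request(read_queue, dirty_queue, pick_min_service_time):
    # Reads win while the read queue is long; otherwise a single dirty
    # block is destaged, chosen to minimize positioning time (an
    # SATF-style selection rather than a CSCAN batch).
    if read_queue and len(read_queue) > READ_THRESHOLD:
        return pick_min_service_time(read_queue)
    if dirty_queue:
        return pick_min_service_time(dirty_queue)
    if read_queue:
        return pick_min_service_time(read_queue)
    return None
```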

10.
Write-invalidate protocols suffer from memory-access penalties due to coherence misses. While write-update or hybrid update/invalidate protocols can reduce coherence misses, the update traffic can increase memory-system contention. We show in this paper that update-based cache protocols can perform significantly better than write-invalidate protocols by incorporating a write cache in each processing node. Because it is legal under relaxed memory consistency models to delay the propagation of modifications to a block until the next synchronization, a write cache can significantly reduce traffic by exploiting locality in write accesses. Concentrating on a cache-coherent NUMA architecture, we study the implementation aspects of augmenting a write-invalidate, a write-update, and two hybrid update/invalidate protocols with write caches. Through detailed architectural simulations using five benchmark programs, we find that write caches, with only a few blocks each, help write-invalidate protocols cut the false-sharing miss rate, and help hybrid update/invalidate protocols keep other copies, including the memory copy, clean at an acceptable write traffic level. Overall, the memory-access penalty associated with coherence misses is drastically reduced.
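A sketch of how such a write cache coalesces writes and defers propagation until the next synchronization point (the structure and hooks are illustrative, not the simulated protocols):

```python
# Sketch of a per-node write cache under relaxed consistency: writes to
# shared blocks are coalesced locally, and coherence traffic is emitted
# only at the next synchronization point (or on capacity eviction).

class WriteCache:
    def __init__(self, num_blocks):
        self.capacity = num_blocks      # "only a few blocks each"
        self.pending = {}               # block_id -> latest value

    def write(self, block_id, value, propagate):
        if block_id not in self.pending and len(self.pending) == self.capacity:
            # Cache full: propagate one pending write to make room.
            victim = next(iter(self.pending))
            propagate(victim, self.pending.pop(victim))
        self.pending[block_id] = value  # repeated writes coalesce here

    def synchronize(self, propagate):
        # Release point: everything pending must become visible, so flush
        # all coalesced writes as coherence actions.
        for block_id, value in self.pending.items():
            propagate(block_id, value)
        self.pending.clear()
```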

11.
唐震, 吴恒, 王伟, 魏峻, 黄涛 《软件学报》(Journal of Software) 2017, 28(8):1982-1998
Emerging storage media, typified by SSDs, are widely used in virtualized environments, usually as read/write caches for virtual machines to optimize disk I/O performance. Existing work mostly focuses on capacity planning for the SSD cache and evaluates cache allocation by read/write hit ratios; it does not adequately consider the SSD's service capacity ceiling, so it fits typical distributed application scenarios poorly: virtual machines may contend for SSD cache resources, causing the applications inside them to violate their performance targets. This paper implements an adaptive SSD caching system for multi-objective optimization in virtualized environments that takes the SSD's service capacity ceiling into account. An adaptive closed loop provides dynamic awareness of virtual machine and application states; the system dynamically detects local SSD cache contention, generates optimized virtual machine placement plans with a clustering method, and determines the order and timing of virtual machine migrations according to the global SSD cache supply capacity. Experimental results show that in typical distributed application scenarios the approach effectively relieves contention for SSD cache resources while satisfying application placement requirements, improving application performance while preserving reliability. In a Hadoop scenario it reduces task execution time by 25% on average and improves the throughput of I/O-intensive applications by 39% on average; in a ZooKeeper scenario it handles virtual machine outages caused by single-point failures of the virtualization host at a performance cost of less than 5%.

12.
Data caching at mobile clients is an important technique for improving the performance of wireless data dissemination systems. However, variable data sizes, data updates, limited client resources, and frequent client disconnections make cache management a challenge. We propose a gain-based cache replacement policy, Min-SAUD, for wireless data dissemination when cache consistency must be enforced before a cached item is used. Min-SAUD considers several factors that affect cache performance, namely, access probability, update frequency, data size, retrieval delay, and cache validation cost. The paper employs stretch as the major performance metric since it accounts for the data service time and, thus, is fair when items have different sizes. We prove that Min-SAUD achieves optimal stretch under some standard assumptions. Moreover, a series of simulation experiments has been conducted to thoroughly evaluate the performance of Min-SAUD under various system configurations. The simulation results show that, in most cases, the Min-SAUD replacement policy substantially outperforms two existing policies, namely, LRU and SAIU.
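To illustrate the shape of a gain-based replacement decision, the sketch below combines the listed factors into a per-item score; the scoring rule is an assumption for illustration only, not the paper's exact Min-SAUD gain formula:

```python
# Illustrative gain-based eviction combining the factors the abstract
# lists: access probability, update frequency, data size, retrieval
# delay, and validation cost. NOT the exact Min-SAUD formula.

def gain(item):
    # Expected benefit of keeping the item: saved retrieval delay minus
    # validation cost, discounted by how often updates invalidate it,
    # per unit of cache space it occupies.
    p_access = item["access_prob"]
    f_update = item["update_freq"]
    delay = item["retrieval_delay"]
    validate = item["validation_cost"]
    size = item["size"]
    return p_access * (delay - validate) / ((1.0 + f_update) * size)

def choose_victims(cache_items, needed_space):
    # Evict items with the smallest gain until enough space is freed.
    victims, freed = [], 0
    for item in sorted(cache_items, key=gain):
        if freed >= needed_space:
            break
        victims.append(item)
        freed += item["size"]
    return victims
```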

13.
In a multinode data sharing environment, different buffer coherency control schemes based on various lock retention mechanisms can be designed to exploit the concept of deferring the propagation or writing of dirty pages to disk to improve normal performance. Two types of deferred write policies are considered. One policy propagates dirty pages to disk only when they are flushed out of the buffer under LRU replacement; the other also writes them out when dirty pages are transferred across nodes. The dirty page propagation policy can have significant implications for the database recovery time. In this paper, we provide an analytical modeling framework for the analysis of the recovery times under the two deferred write policies, and demonstrate how these policies can be mapped onto a unified analytic modeling framework. The main challenge in the analysis is to obtain the pending update count distribution, which can be used to determine the average numbers of log records and data I/Os that must be applied during recovery. The analysis goes beyond previous work on modeling buffer hit probability in a data sharing system, where only the average buffer composition, not the distribution, needs to be estimated, and beyond recovery analysis in a single-node environment, where the complexities of tracking the propagation of dirty pages across nodes and the buffer invalidation effect do not appear.

14.
This paper describes an implementation of a disk cache for RAID5. The disk array cache implementation uses mature techniques such as set-associative mapping and LRU replacement, and adopts a write-back policy, which improves disk write speed and reduces redundant disk writes. In addition, locking each parity group effectively prevents the data inconsistency that can result from multiple blocks in the same parity group being destaged simultaneously.
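A sketch of the parity-group locking idea (stripe geometry and names are assumptions):

```python
import threading

# Sketch of parity-group locking during destage: blocks in the same
# RAID5 parity group must not be destaged concurrently, or the parity
# update could race and leave the group inconsistent.

BLOCKS_PER_GROUP = 4              # data blocks per parity group (assumed)

locks = {}                        # parity_group -> lock
locks_guard = threading.Lock()

def parity_group(block_addr):
    return block_addr // BLOCKS_PER_GROUP

def group_lock(group):
    with locks_guard:
        return locks.setdefault(group, threading.Lock())

def destage(block_addr, write_block_and_parity):
    # Serialize all destages that touch the same parity group.
    with group_lock(parity_group(block_addr)):
        write_block_and_parity(block_addr)
```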

15.
Solid-state drives (SSDs) have been widely used as a caching tier for disk-based RAID systems to speed up data-intensive applications. However, traditional cache schemes fail to effectively boost parity-based RAID storage systems (e.g., RAID-5/6), which have poor random write performance due to the small-write problem. Worse, intensive cache writes can wear out the SSD quickly, causing performance degradation and increased cost. In this article, we present the design and implementation of KDD, an efficient SSD-based caching system which Keeps Data and Deltas in SSD. When write requests hit in the cache, KDD dispatches the data to the RAID storage without updating the parity blocks, mitigating the small-write penalty, and compactly stores the compressed deltas in the SSD to reduce cache write traffic while guaranteeing reliability in case of disk failures. In addition, KDD organizes the metadata partition on the SSD as a circular log to make the cache persistent with low overhead. We evaluate the performance of KDD via both simulations and prototype implementations. Experimental results show that KDD effectively reduces the small-write penalty while extending the lifetime of the SSD-based cache by up to 6.85 times.
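A sketch of KDD's write-hit path as described above; the XOR delta and zlib compression are illustrative stand-ins for the paper's actual delta encoding, and all hooks are hypothetical:

```python
import zlib

# Sketch of the write-hit path: new data goes to the RAID array without
# a parity update, while a compressed delta against the cached old
# version is kept in SSD so the block can be rebuilt after a disk failure.

def handle_write_hit(block_id, new_data, ssd_cache, raid_write):
    old_data = ssd_cache[block_id]
    # Byte-wise XOR delta: mostly zeros when the change is small,
    # so it compresses well.
    delta = bytes(a ^ b for a, b in zip(old_data, new_data))
    compressed_delta = zlib.compress(delta)
    ssd_cache[block_id] = new_data               # keep current data in SSD
    ssd_store_delta(block_id, compressed_delta)  # small SSD write, not a full block
    raid_write(block_id, new_data, update_parity=False)  # avoid small-write penalty

def ssd_store_delta(block_id, blob):
    # Placeholder for appending the delta to the SSD's delta region.
    pass
```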

16.
Deduplication technology has been increasingly used to reduce storage costs. Though it has been successfully applied to backup and archival systems, existing techniques can hardly be deployed in primary storage systems due to the latency cost of detecting duplicated data: every unit has to be checked against a substantially large fingerprint index before it is written. In this paper we introduce Leach, a self-learning in-memory fingerprint cache that reduces the write-path cost of deduplication for inline primary storage. Leach is motivated by a characteristic of real-world I/O workloads: the access patterns of duplicated data are highly skewed. Leach adopts a splay tree to organize the on-disk fingerprint index, automatically learns the access patterns, and maintains hot working sets in cache memory, with the goal of servicing the majority of duplicate-detection lookups. Leveraging the working set property, Leach also reduces the cost of splay operations on the fingerprint index and of cache updates. In comprehensive experiments on several real-world datasets, Leach outperforms the conventional LRU (least recently used) cache policy by reducing the number of cache misses, and significantly improves write performance without greatly affecting the cache hit ratio.
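A sketch of the lookup flow only; Leach's on-disk index is a self-adjusting splay tree, for which a move-to-front map stands in here purely for illustration:

```python
from collections import OrderedDict

# Sketch of a small in-memory cache of hot fingerprints in front of the
# full on-disk index, so most duplicate checks never touch the disk.

class FingerprintCache:
    def __init__(self, capacity, disk_index_lookup):
        self.capacity = capacity
        self.disk_lookup = disk_index_lookup   # fingerprint -> location or None
        self.hot = OrderedDict()               # fingerprint -> block location

    def is_duplicate(self, fingerprint):
        if fingerprint in self.hot:
            self.hot.move_to_end(fingerprint)  # keep the working set hot
            return self.hot[fingerprint]
        location = self.disk_lookup(fingerprint)   # the expensive path
        if location is not None:
            self.hot[fingerprint] = location       # learn the access pattern
            if len(self.hot) > self.capacity:
                self.hot.popitem(last=False)
        return location
```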

17.
Heterogeneous storage architectures combine the strengths of different storage devices in a synergistic fashion and are increasingly being used in mobile storage systems. In this paper, we propose ARC-H, an adaptive cache replacement algorithm for heterogeneous storage systems consisting of a hard disk and a NAND flash memory. ARC-H employs a dynamically adaptive management policy based on ghost buffers, and takes account of recency, per-device I/O cost, and workload patterns in making cache replacement decisions. Realistic trace-driven simulations show that ARC-H reduces service time by up to 88% compared with existing caching algorithms with a 20 MB cache. ARC-H also reduces energy consumption by up to 81%.
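A sketch of a cost-aware eviction decision in the spirit of ARC-H (the cost values and the scoring rule are assumptions, not the paper's exact policy):

```python
# Sketch: recency is weighted by the per-device cost of re-fetching a
# block, so a block backed by the slow hard disk is kept longer than an
# equally recent block backed by NAND flash.

IO_COST = {"hdd": 10.0, "flash": 1.0}   # relative miss penalty per device

def eviction_score(entry, now):
    recency = now - entry["last_access"]        # larger = colder
    return recency / IO_COST[entry["device"]]   # cheap-to-refetch evicts first

def pick_victim(cache_entries, now):
    return max(cache_entries, key=lambda e: eviction_score(e, now))

# Example: the flash-backed block is evicted before a slightly older
# HDD-backed one, because re-reading flash is far cheaper.
entries = [
    {"id": "a", "device": "hdd", "last_access": 90.0},
    {"id": "b", "device": "flash", "last_access": 95.0},
]
assert pick_victim(entries, now=100.0)["id"] == "b"
```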

18.
《Location Science》1995, 3(2):125-132
This work is concerned with finding the anticipatory positions of disk arms in mirrored disk systems that minimize expected travel distance. In such systems, data is duplicated across two or more disk drives. A 'read' request may choose to read from any copy, and thus does so from the disk whose arm is closest to the request location. Since a 'write' must update all copies, the response time for such a request depends on the distance of the arm that is furthest from the request's location. Some problems of optimally positioning emergency service units on a line, or of positioning idle elevators, can be viewed mathematically as a special case of the mirrored disk scenario with 'read' requests only. We show that, for any request location distribution, if there are more write than read requests, then both arms should be located as if read requests did not exist: both at the median of the distribution. For situations where most requests are of 'read' type, we derive necessary conditions for optimal locations.
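The median claim rests on the classical fact that a median minimizes expected absolute deviation; a short sketch in our own notation:

```latex
% A median minimizes expected travel distance for co-located arms.
With both arms at position $a$ and request location $X \sim F$, the
expected travel distance is $g(a) = \mathbb{E}\,\lvert X - a \rvert$.
For any $a \ge m$, where $m$ is a median of $F$,
\[
  g(a) - g(m) \;=\; \int_{m}^{a} \bigl( 2F(t) - 1 \bigr)\, dt \;\ge\; 0,
\]
since $F(t) \ge \tfrac{1}{2}$ for $t \ge m$; the case $a \le m$ is
symmetric. A read pays the distance of the nearer arm and a write that
of the farther arm, so when writes outnumber reads the paper's result
is to co-locate both arms at $m$, exactly where a single arm serving
all requests would sit.
```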

19.
尹洋, 刘振军, 许鲁 《软件学报》(Journal of Software) 2009, 20(10):2752-2765
As the scale of computing grows and network storage systems are applied ever more widely, the I/O performance demanded of network storage systems keeps rising. Under high storage system load, it becomes practical to use slower media as a data cache on the I/O path between clients and the network storage system. This paper designs and implements D-Cache, a block-level cache prototype for storage systems based on disk media. The disk cache is managed with a two-level structure, and a corresponding block-level two-level cache management algorithm is proposed. The algorithm effectively addresses the cache management difficulties caused by the slow response of disk media, and uses a bitmap to eliminate the copy-on-write overhead on a disk cache write miss. Tests of the prototype show that under high storage server load, the cache system effectively improves overall system performance.
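A sketch of the bitmap idea that removes the copy-on-write on a write miss (sector granularity and names are assumptions):

```python
# Sketch: track per-sector validity of each cached block, so a write
# miss fills only the written sectors and marks them valid, instead of
# first copying the whole block from backing storage (copy-on-write).

SECTORS_PER_BLOCK = 8

class CacheBlock:
    def __init__(self):
        self.valid = [False] * SECTORS_PER_BLOCK   # the per-block bitmap
        self.data = [None] * SECTORS_PER_BLOCK

    def write(self, sector, payload):
        # Write-miss path: no read-modify-write of the whole block;
        # just store the sector and flip its bit.
        self.data[sector] = payload
        self.valid[sector] = True

    def read(self, sector, fetch_from_backing):
        if not self.valid[sector]:
            # Only sectors never written (or cached) go to backing store.
            self.data[sector] = fetch_from_backing(sector)
            self.valid[sector] = True
        return self.data[sector]
```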

20.
In continuous media servers, disk load can be reduced by using a buffer cache. In order to utilize the disk bandwidth saved by caching, a continuous media server must employ an admission control scheme to decide whether a new client can be admitted for service without violating the requirements of clients already being serviced. A scheme providing deterministic QoS guarantees in servers that use caching has already been proposed. Since deterministic admission control is based on worst-case assumptions, however, it wastes system resources. If we could exactly predict the future available disk bandwidth, both high disk utilization and hiccup-free service would be achievable; but since the caching effect cannot be determined analytically, predicting the disk load without substantial computation overhead is difficult. In this paper, we propose a statistical admission control scheme for continuous media servers in which caching is used to reduce disk load. The scheme improves disk utilization and allows more streams to be serviced while maintaining near-deterministic service. The scheme, called Shortsighted Prediction Admission Control (SPAC), combines exact prediction through on-line simulation with statistical estimation using a probabilistic model of future disk load, in order to reduce computation overhead. It thereby exploits the variation in disk load induced by VBR-encoded objects and the decrease in client load due to caching. Trace-driven simulations demonstrate that the scheme provides near-deterministic QoS and keeps disk utilization high.
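A sketch of an SPAC-style admission test, exact simulation over a short horizon plus a statistical bound beyond it (the stream objects, load estimators, and normal-approximation bound are assumptions, not the paper's model):

```python
# Sketch: simulate disk load exactly over a short horizon (cheap), then
# fall back to a probabilistic bound for the load beyond it.

def admit(new_stream, active_streams, disk_bw, horizon,
          mean_load, std_load, z=2.33):   # z ~ 99th percentile of N(0,1)
    streams = active_streams + [new_stream]

    # Shortsighted part: exact per-slot prediction via on-line simulation
    # of VBR demand net of expected cache hits over the near horizon.
    for t in range(horizon):
        demand = sum(s.disk_demand(t) for s in streams)   # hypothetical hook
        if demand > disk_bw:
            return False        # a hiccup would be certain: reject

    # Statistical part: beyond the horizon, require that the probabilistic
    # load model stays under the disk bandwidth with high probability.
    return mean_load(streams) + z * std_load(streams) <= disk_bw
```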
