Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Solid-state drives (SSDs) have been widely used as a caching tier for disk-based RAID systems to speed up data-intensive applications. However, traditional cache schemes fail to effectively boost parity-based RAID storage systems (e.g., RAID-5/6), which have poor random write performance due to the small-write problem. Worse, intensive cache writes can wear out the SSD quickly, which degrades performance and raises cost. In this article, we present the design and implementation of KDD, an efficient SSD-based caching system which Keeps Data and Deltas in SSD. When write requests hit in the cache, KDD dispatches the data to the RAID storage without updating the parity blocks to mitigate the small-write penalty, and compactly stores the compressed deltas in SSD to reduce the cache write traffic while guaranteeing reliability in case of disk failures. In addition, KDD organizes the metadata partition on SSD as a circular log to make the cache persistent with low overhead. We evaluate the performance of KDD via both simulations and prototype implementations. Experimental results show that KDD effectively reduces the small-write penalty while extending the lifetime of the SSD-based cache by up to 6.85 times.
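
The delta idea in the KDD abstract rests on a standard property of XOR parity: if a delta is taken to be the XOR of a block's old and new contents (an assumption here, since the abstract does not define the delta format), the parity can be brought up to date later from the old parity and the delta alone. The sketch below illustrates only that identity. The function names are hypothetical, compression is left out, and this is not KDD's implementation.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_parity(data_blocks: Sequence[bytes]) -> bytes:
    """RAID-5 style parity: XOR of all data blocks in one stripe."""
    return reduce(xor_blocks, data_blocks)


def make_delta(old_block: bytes, new_block: bytes) -> bytes:
    """Delta assumed to be old XOR new (hypothetical format)."""
    return xor_blocks(old_block, new_block)


def apply_delta(old_parity: bytes, delta: bytes) -> bytes:
    """Deferred parity update that needs only the stored delta."""
    return xor_blocks(old_parity, delta)


# A toy stripe with three data blocks.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
old_parity = stripe_parity(stripe)

new_block = b"\xff\x00\xff\x00"
delta = make_delta(stripe[1], new_block)  # kept in the cache
stripe[1] = new_block                     # data written, parity left stale

# Later, the stale parity is repaired from the delta alone.
new_parity = apply_delta(old_parity, delta)
```

The repaired parity equals what a full recomputation over the updated stripe would give, which is why the parity write can be postponed without reading the other data blocks.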

2.
Distributed sparing is a method to improve the performance of RAID5 disk arrays with respect to a dedicated sparing system with N+2 disks (including the spare disk), since it utilizes the bandwidth of all N+2 disks. We analyze the performance of RAID5 with distributed sparing in normal mode, degraded mode, and rebuild mode in an OLTP environment, which implies small reads and writes. The analysis in normal mode uses an M/G/1 queuing model, which takes into account the components of disk service time. In degraded mode, a low-cost approximate method is developed to estimate the mean response time of fork-join requests resulting from accesses to recreate lost data on the failed disk. Rebuild mode performance is analyzed by considering an M/G/1 vacationing server model with multiple vacations of different types to take into account differences in processing requirements for reading the first and subsequent tracks. An iterative solution method is used to estimate the mean response time of disk requests, as well as the time to read each disk; validation against simulation results shows it to be quite accurate. We next compare RAID5 performance in a system (1) without a cache; (2) with a cache; and (3) with a nonvolatile storage (NVS) cache. The last configuration, in addition to improved read response time due to cache hits, provides a fast-write capability, such that dirty blocks can be destaged asynchronously and at a lower priority than read requests, resulting in an improvement in read response time. The small write penalty is also reduced due to the possibility of repeated writes to dirty blocks in the cache and by taking advantage of disk geometry to efficiently destage multiple blocks at a time.
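
For the normal-mode M/G/1 model mentioned in this abstract, the textbook mean response time comes from the Pollaczek-Khinchine formula, R = E[S] + λ·E[S²] / (2(1 − ρ)) with ρ = λ·E[S]. A minimal sketch follows, assuming the first two moments of the disk service time are already known. The function name and parameters are illustrative, and the paper's own breakdown of service time into its components is not reproduced.

```python
def mg1_mean_response_time(arrival_rate: float,
                           mean_service: float,
                           second_moment_service: float) -> float:
    """Mean response time of an M/G/1 queue (Pollaczek-Khinchine).

    arrival_rate           -- Poisson arrival rate, requests per second
    mean_service           -- E[S], mean service time in seconds
    second_moment_service  -- E[S^2], second moment of service time
    """
    rho = arrival_rate * mean_service  # server utilization
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    mean_wait = arrival_rate * second_moment_service / (2.0 * (1.0 - rho))
    return mean_service + mean_wait


# Sanity check against M/M/1: exponential service with rate mu has
# E[S] = 1/mu and E[S^2] = 2/mu^2, and the known result R = 1/(mu - lam).
mu, lam = 100.0, 50.0
r = mg1_mean_response_time(lam, 1.0 / mu, 2.0 / mu ** 2)
```

The M/M/1 special case is a convenient check because its closed form is well known; real disk service times are not exponential, which is the reason for using the general M/G/1 form.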

3.
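Wu Kunyao, Chai Yunpeng, Zhang Dafang, Wang Xin. Journal of Software (《软件学报》), 2022, 33(12): 4851-4868
In recent years, the storage density of conventional magnetic recording has reached its limit. To meet rapidly growing demand for data capacity, several new storage technologies have emerged; among them, shingled magnetic recording (SMR) has been commercialized and is in use in enterprises. Because of the shingled track layout, random writes to an SMR disk cause write amplification and degrade disk performance. The problem becomes more severe when a traditional high-reliability storage scheme such as RAID5 is deployed, because parity data is updated very frequently and generates a large number of random write requests inside the disk. The study finds that SMR disks in fact contain "free tracks" that can be updated in place. Based on free tracks, it proposes FT-RAID, a high-reliability data storage method designed specifically for SMR disks, as a replacement for classic RAID5 to build an inexpensive, high-capacity, highly reliable storage system. FT-RAID has two parts: free-track mapping (FT-mapping) and the free-track buffer (FT-buffer). FT-mapping implements an SMR-friendly RAID mapping that places frequently updated parity blocks on free tracks; FT-buffer implements an SMR-friendly two-layer buffer structure in which the upper layer ensures that hot data can be updated in place and the lower layer increases buffer capacity. Based on real enterprise...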

4.
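To meet the exponentially growing storage demand of big data, modern distributed storage systems must provide large storage capacity and fast storage service. Mainstream distributed storage systems therefore all apply erasure coding to cut the disk cost of data centers, ensure data reliability, and meet the fast storage needs of applications and clients. In practice, data items differ in importance and in their availability requirements, and different disks have failure rates and reliability that vary over time; these characteristics pose new challenges for traditional RAID storage, including erasure-code-based storage systems. This paper proposes On-demand ARECS (On-demand Availability and Reliability Oriented Adaptive Erasure Coded Storage System), a flexible, adaptive erasure-coded storage design for dynamic data availability and disk reliability requirements. It considers multiple dimensions of back-end data availability and disk reliability to jointly determine the erasure coding strategy and the choice of storage nodes, thereby reducing storage redundancy and storage latency while improving data availability and storage reliability. We ran experiments on the open-source distributed file system Tahoe-LAFS. The results confirm our theoretical analysis: data redundancy and storage latency drop markedly while diverse data availability and disk reliability requirements are still met.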

5.
Redundant arrays of independent disks (RAID) provide an efficient, stable storage system for parallel access and fault tolerance. The most common fault-tolerant RAID architectures are RAID-1 and RAID-5. The disadvantage of RAID-1 lies in excessive redundancy, while the write performance of RAID-5 is only 1/4 of that of RAID-0. In this paper, we propose a high-performance and highly reliable disk array architecture, called stripped mirroring disk array (SMDA). It is a new solution to the small-write problem for disk arrays. SMDA stores the original data in two ways, one on a single disk and the other on a plurality of disks in RAID-0 by striping. The reliability of the system is as good as that of RAID-1, but with a high throughput approaching that of RAID-0. Because SMDA omits the parity generation procedure when writing new data, it avoids the write performance loss often experienced in RAID-5.
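
The 1/4 figure matches the usual accounting of the small-write problem: a read-modify-write update of one block on RAID-5 costs four disk I/Os (read old data, read old parity, write new data, write new parity), against one on RAID-0 and two on RAID-1. The tally below is a back-of-the-envelope sketch under that standard model, not a measurement from the paper. The dictionary and function names are made up for illustration, and SMDA itself is not modeled.

```python
# Disk I/Os needed to update a single block under the standard model.
SMALL_WRITE_IOS = {
    "RAID-0": 1,  # write the block
    "RAID-1": 2,  # write both mirror copies
    "RAID-5": 4,  # read data, read parity, write data, write parity
}


def relative_small_write_throughput(level: str,
                                    baseline: str = "RAID-0") -> float:
    """Small-write throughput of `level` as a fraction of `baseline`,
    assuming throughput is inversely proportional to I/Os per update."""
    return SMALL_WRITE_IOS[baseline] / SMALL_WRITE_IOS[level]


raid5_vs_raid0 = relative_small_write_throughput("RAID-5")
raid1_vs_raid0 = relative_small_write_throughput("RAID-1")
```

This counts I/Os only; it ignores caching, full-stripe writes, and the fact that mirror writes can proceed in parallel on separate disks.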

6.
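Applications such as video surveillance, backup, and archiving have distinctive workload characteristics and I/O access patterns, so they call for dedicated storage energy-saving methods. A partial-parallelism strategy in disk arrays helps such storage systems save energy, but it usually forces the RAID to perform small writes, which severely hurts performance. To address this, the paper proposes Ripple-RAID, a high-efficiency disk array for this class of storage systems. It adopts a new partially parallel data layout and combines address translation, out-of-place updates, progressive parity generation based on pipelining, and segmented data recovery. Under single-disk fault tolerance, it retains the energy savings of partial parallelism while effectively solving the small-write problem that partial parallelism introduces. Ripple-RAID delivers outstanding performance and energy efficiency: under an 80% sequential-write workload with a request size of 512 KB, its write performance is 3.9 times that of S-RAID 5, 1.9 times that of Hibernator and MAID, and 0.49 times that of PARAID and eRAID 5; it saves 20% more energy than S-RAID 5, 33% more than Hibernator and MAID, 70% more than eRAID 5, and 72% more than PARAID.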

7.
Redundant array of independent SSDs (RAIS) is generally based on the traditional RAID design and implementation. The random small-write problem is a serious challenge for RAIS. Random small writes in parity-based RAIS systems generate significantly more pre-reads and writes, which can degrade RAIS performance and shorten SSD lifetime. To overcome the well-known write-penalty problem in parity-based RAID5 storage systems, several logging techniques such as Parity Logging and Data Logging have been put forward. However, these techniques were originally designed around the mechanical characteristics of HDDs and ignore the properties of flash memory. In this article, we first propose RAISL, a flash-aware logging method that improves the small-write performance of RAIS storage systems. RAISL writes only the new data to the log SSD, rather than both the new data and the pre-read data, by making full use of the invalid pages on the SSDs of the RAIS. RAISL does not need to perform pre-read operations, so the original characteristics of the workloads are kept. Second, we propose AGCRL on the basis of RAISL to further boost performance. AGCRL combines RAISL with access characteristics to guide read and write cost regulation and improve the performance of RAIS storage systems. Our experiments demonstrate that RAISL significantly improves write performance and that AGCRL improves both write and read performance. On average, AGCRL outperforms RAIS5 and RAISL by 39.15% and 16.59%, respectively.

8.
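In disk array models, the key question is how to tolerate multiple disk failures while keeping system performance optimal. This paper proposes a new class of double-error-correcting codes, V codes. In a disk array data layout that uses this code, the number of disks in the array can be even, parity information is spread evenly across every disk, and any two disk failures can be tolerated. Compared with disk array layouts based on other double-error-correcting codes, when the number of disks is even the V-code layout achieves optimal performance: the lowest encoding and decoding complexity, the lowest redundancy rate, and the best small-write performance, which helps solve the disk array I/O problem.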

9.
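Li Songtao, Jin Xin. Journal of Computer Applications (《计算机应用》), 2014, 34(10): 2800-2805
To ensure high data availability in cloud storage systems, reduce data storage and bandwidth costs, and shorten access time for data objects, a new scheme called CAROM, which determines the cache size adaptively, is proposed. CAROM combines traditional cache-based methods with error-correcting codes to improve the resilience and efficiency of cloud file systems. In addition, to balance cache size against its benefit, an adaptive method based on the convexity of the overall cost function is proposed to select the cache size. In a performance evaluation on real-world file system data, CAROM's storage cost and bandwidth cost are 60% and 43% lower than under the replication and error-correcting-code strategies, respectively, while its access latency is comparable to that of replication. The results show that CAROM supports the consistency semantics of current cloud file systems while offering low bandwidth cost, low storage cost, and low access cost.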

10.
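With high performance, high scalability, high availability, and ease of management, cloud file systems have become the foundation and core of cloud storage and big data. Cloud file systems generally use full replication to improve fault tolerance, the utilization efficiency of data resources, and system performance. However, the storage overhead of full replication grows linearly with the number of replicas, and storing replicas incurs extra write bandwidth and data management overhead. Erasure codes ensure high data reliability and availability through well-designed redundant encoding without adding excessive storage space. This paper studies the application of erasure coding in cloud file systems and examines erasure code design from six aspects: code type, encoding object, encoding timing, data modification, data access method, and data access performance, in order to strengthen the storage model of cloud file systems. On this basis, an erasure code prototype system is designed and implemented, and experiments show that erasure codes effectively guarantee data availability in cloud file systems while saving storage space.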

11.
Mirrored disks, or RAID1, is a popular disk array paradigm which, in addition to providing fault tolerance, doubles the data access bandwidth. This is important in view of rapidly increasing disk capacities and the slow improvement in disk access time. Caching dirty data blocks in a non-volatile storage (NVS) cache allows the destaging of dirty blocks to be deferred, so as to improve the response time of read requests by giving them a higher priority than write requests. Destaging dirty blocks in batches to take advantage of disk geometry results in lower disk utilization due to writes and improved performance for reads. Polyzois et al. [12] propose a scheduling policy for mirrored disks equipped with an NVS cache, so that one disk processes read requests while the other disk processes a write batch according to the CSCAN policy. We propose an improved scheduling policy as follows: (i) eliminating the forced idleness caused by the batch processing paradigm for write requests, i.e., allowing write requests to be processed individually; (ii) using SATF, or even an exhaustive search, to reduce destaging time compared to CSCAN; (iii) introducing a threshold for the number of read requests which, when exceeded, defers the destaging of dirty blocks. We compare these two scheduling policies with each other and also against policies that prioritize the processing of reads over writes: (i) the head-of-the-line (HOL) priority queueing discipline, (ii) SATF with conditional priorities. It follows from simulation results that the new method outperforms Polyzois' method, which is even outperformed by the HOL priority policy. SATF with conditional priorities slightly outperforms the proposed method in throughput and response time, but is susceptible to more variability in response time. Recommended by: Ahmed Elmagarmid

12.
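Error Detection Mechanism in Erasure-Code Byzantine Fault-Tolerant Quorum   Total citations: 3, self-citations: 0, citations by others: 3
In large-scale storage systems, tolerating Byzantine storage nodes is becoming increasingly important. Traditional Byzantine quorums tolerate Byzantine failures through replication, but they have two main drawbacks: low storage space utilization and static quorum parameters. We propose the Erasure-code Byzantine Fault-tolerance Quorum (E-BFQ). E-BFQ uses erasure codes as its redundancy strategy, which provides high reliability while occupying less storage space than replication. Through client read/write operations and manager diagnosis operations, E-BFQ can detect Byzantine nodes and dynamically adjust the system size and the fault threshold. The results show that the method achieves the goal of dynamic adjustment.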

13.
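The I/O system software stack is an important factor in the performance of NVM storage systems. To address the unbalanced read and write speeds and the limited write endurance of NVM storage systems, an access request management strategy that fuses synchronous and asynchronous handling is designed: writes with large data volumes are managed asynchronously, while read requests and writes with small data volumes are still managed synchronously. To address the high address translation overhead when different cores of a multicore processor access the storage system, an address translation caching strategy for multicore processors is designed to reduce the time spent on address translation. Finally, a prototype NVM storage system supporting highly concurrent access (CNVMS) is implemented and tested with general-purpose benchmark tools under random read/write, sequential read/write, mixed read/write, and real application workloads. Experimental results show that, compared with PMBD, the proposed strategies improve read/write speed by 1% to 22% and IOPS by 9% to 15%, confirming that the CNVMS strategies effectively improve the I/O performance and request processing speed of NVM storage systems.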

14.
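Design and Implementation of a Virtual Non-Volatile Hierarchical Cache   Total citations: 1, self-citations: 0, citations by others: 1
To solve the small-write problem in disk storage, this paper studies a new hierarchical cache structure. The hierarchical cache uses a virtual-memory page file and part of the system RAM to form a two-level cache structure, which exploits the large performance gaps in disk access between large and small writes and between random and sequential access. Tests with the Ntiogen and Mailbench benchmarks show that this hierarchical cache improves the I/O subsystem's performance in handling bursts of intensive small writes.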

15.
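A New Technique for Protecting Against Two Disk Failures   Total citations: 3, self-citations: 0, citations by others: 3
Building mass storage systems is currently the hottest and fastest-growing area of computer systems, and the main part of a storage system is the online storage system. RAID (disk array) is of great significance for improving storage system efficiency, ensuring high data reliability, and preventing data corruption and business interruption. RAID 1, RAID 0+1, RAID 4, and RAID 5, as currently used in practice, can protect only against the failure of a single disk, and production environments have already seen many incidents in which a double-disk failure caused long business outages. After introducing the common RAID levels, the paper presents a new diagonal parity method which, combined with horizontal parity, can protect against the failure of two disks. Through rigorous mathematical analysis, it can be seen that the method can greatly improve ...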

16.
Redundancy is the basic technique to provide reliability in storage systems consisting of multiple components. A redundancy scheme defines how the redundant data are produced and maintained. The simplest redundancy scheme is replication, which however suffers from storage inefficiency. Another approach is erasure coding, which provides the same level of reliability as replication using a significantly smaller amount of storage. When redundant data are lost, they need to be replaced. While replacing replicated data is a simple copy, it becomes a complex operation with erasure codes: new data are produced by performing a coding operation over some other available data. The amount of data to be read and coded is d times larger than the amount of data produced, where d, called the repair degree, is larger than 1 and depends on the structure of the code. This implies that coding has a larger computational and I/O cost, which, for distributed storage systems, translates into increased network traffic. Participants in peer-to-peer systems often have ample storage and CPU power, but their network bandwidth may be limited. For these reasons, existing coding techniques are not suitable for P2P storage. This work explores the design space between replication and the existing erasure codes. We propose and evaluate a new class of erasure codes, called Hierarchical Codes, which reduce the network traffic due to maintenance without losing the benefits given by traditional erasure codes.
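
To make the trade-off concrete, consider a classic (n, k) maximum-distance-separable code such as Reed-Solomon, where any k of the n blocks rebuild the data, so repairing one lost block reads k blocks (repair degree d = k). The sketch below compares that with r-way replication. It is a generic illustration with hypothetical helper names, and it does not model the Hierarchical Codes the paper proposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedundancyCost:
    storage_overhead: float  # stored bytes per byte of user data
    tolerated_losses: int    # blocks that may be lost without data loss
    repair_degree: int       # blocks read to rebuild one lost block


def replication(copies: int) -> RedundancyCost:
    """Cost of keeping `copies` full replicas of each block."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return RedundancyCost(float(copies), copies - 1, 1)


def mds_erasure_code(n: int, k: int) -> RedundancyCost:
    """Cost of an (n, k) MDS code: k data blocks encoded into n blocks."""
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return RedundancyCost(n / k, n - k, k)


three_way = replication(3)
rs_6_4 = mds_erasure_code(6, 4)
```

Both configurations survive two lost blocks, but the (6, 4) code stores 1.5 bytes per user byte instead of 3 while reading four blocks per repair instead of one. That repair traffic is the cost the abstract singles out as a problem for bandwidth-limited peers.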

17.
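This paper proposes coding-based grid data replication: replica data is encoded with a linear block code and dispersed across grid storage nodes, forming a group of coded sub-replicas with erasure-correcting capability. Focusing on linear block codes that are currently active research topics, it discusses coded data replication schemes based on Cauchy Reed-Solomon Code, Tornado Code, and Random Linear Code, uses modeling to examine the replica data access performance and replica data reliability of the three, and compares them with traditional full data replication and chunked data replication, showing that the proposed coded data replication has better overall performance. Experimental data further show that encoding accounts for only a small share of the total data replication overhead, indicating that coded data replication is a feasible technical solution.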

18.
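Main-memory key-value (KV) databases are efficient, easy to use, and scalable. Because main memory capacity is limited, applications with large data volumes must use disk for data exchange. Solid state disks (SSDs) offer fast random reads, so using an SSD as virtual memory for a main-memory KV database improves read performance for data that is not in main memory. However, SSD random write performance is poor, so ...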

19.
RAID has long been established as an effective way to provide highly reliable as well as high-performance disk subsystems. However, reliability in RAID systems comes at the cost of extra disks. In this paper, we describe a mechanism that we have termed RAID0.5 that enables striped disks with very high data reliability but low disk cost. We take advantage of the fact that most disk systems use offline backup systems for disaster recovery. With these offline backup systems in place, the disk system needs to replicate only the data written since the last backup, thus drastically reducing the storage space requirement. Though RAID0.5 has the same data loss characteristics as traditional mirroring, the lower storage space comes at the cost of lower availability. Thus, RAID0.5 is a tradeoff between lower disk cost and lower availability while still preserving very high data reliability. We present analytical reliability models and experimental results that demonstrate the enhanced reliability and performance of the proposed RAID0.5 system.

20.
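Zhang Hang, Tang Dan, Cai Hongliang. Computer Science (《计算机科学》), 2021, 48(5): 130-139
Erasure codes consume little storage space and provide high data reliability, so they are widely adopted in distributed storage systems. However, their high repair cost limits their application. To reduce the repair cost of erasure codes, researchers have done extensive work on group-based codes and regenerating codes. Because group-based codes and regenerating codes are passive fault-tolerance approaches, a proactive approach can better reduce repair cost and maintain system reliability for nodes that are prone to failure. The paper therefore proposes a proactive, prediction-based erasure code, the Proactive basic-Pyramid (PPyramid) code. The PPyramid code uses disk failure prediction to adjust the association between redundant blocks and data blocks in the basic-Pyramid code, placing the disks predicted to fail soon in the same group so that all reads during repair take place within that group, which reduces the number of data blocks read and lowers repair cost. In a distributed storage system built on Ceph, the PPyramid code is compared with other commonly used erasure codes when repairing multiple disk failures. Experimental results show that, compared with the basic-Pyramid code, the PPyramid code reduces repair cost by 6.3% to 34.9% and repair time by 7.6% to 63.6%; compared with the LRC, pLRC, SHEC, and DLRC codes, it reduces repair cost by 8.6% to 52% and repair time by 10...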
