面向分布式文件存储系统的数据恢复策略
引用本文:胡至洵.面向分布式文件存储系统的数据恢复策略[J].中州煤炭,2018,0(3):131-137.
作者姓名:胡至洵
作者单位:(山东大学图书馆,山东 济南 250101)
摘    要:分布式存储系统构建于大量的廉价节点之上,使得节点失效成为一种常态。为了保证数据的可靠性,系统必须具备数据容错方案。纠删码冗余方案可以在提供更低的存储开销的同时,获得和副本冗余方案相同的可靠性。但是在实际运用中,基于纠删码的存储系统在恢复数据时,恢复节点需要从多个存活节点读取数据到本地,然后通过解码算法恢复出数据。这不仅对恢复节点造成了较大压力,而且会占据大量的网络带宽,影响系统整体性能。由此,提出了一种基于纠删码的存储系统数据恢复优化方法。通过对纠删码恢复算法的分析,证明了纠删码的恢复操作是可以并行的;设计了一种基于流水线的并行化数据恢复方案;通过分析现实中的网络拓扑结构,设计了一种可以最小化恢复过程中数据传输总长度的算法,提高网络中高层数据链路利用率。实验表明,相比目前存在的星型恢复方式,流水线式并行恢复方法可以显著降低数据恢复延时,提高恢复效率。

关 键 词:分布式存储  纠删码  流水线并行化  网络拓扑

 Data recovery strategy for distributed file storage system
Hu Zhixun. Data recovery strategy for distributed file storage system[J].Zhongzhou Coal,2018,0(3):131-137.
Authors:Hu Zhixun
Affiliation:(Library of Shandong University,Jinan 250101,China)
Abstract: Distributed storage systems are built on large numbers of inexpensive nodes, so node failure is the norm rather than the exception. To keep data reliable, the system must provide a fault-tolerance scheme. Erasure-code redundancy can achieve the same reliability as replica redundancy at lower storage overhead. In practice, however, when an erasure-coded storage system recovers data, the recovery node must read data from multiple surviving nodes to its local storage and then reconstruct the lost data with a decoding algorithm. This not only places a heavy load on the recovery node but also consumes a large amount of network bandwidth, degrading overall system performance. This paper therefore proposes a method for optimizing data recovery in erasure-coded storage systems. First, an analysis of the erasure-code recovery algorithm proves that the recovery operation can be parallelized. Second, a pipeline-based parallel data recovery scheme is designed. Finally, by analyzing real-world network topologies, an algorithm is designed that minimizes the total data transmission length during recovery and improves the utilization of high-level data links in the network. Experiments show that, compared with the existing star-shaped recovery method, the proposed pipelined parallel recovery method significantly reduces data recovery latency and improves recovery efficiency.
Keywords: distributed storage, erasure code, pipeline parallelization, network topology
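
The contrast the abstract draws between star-shaped and pipelined recovery can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the simplest possible erasure code (a single XOR parity block) in place of whatever code the paper uses, and all function names are hypothetical. Because decoding is a linear combination of the surviving blocks, each surviving node can fold its own block into a partial result and forward it, so the recovery node receives one block instead of all of them.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_parity(data_blocks: list[bytes]) -> bytes:
    """Assumed toy code: one parity block equal to the XOR of all data blocks."""
    return reduce(xor_blocks, data_blocks)


def star_recover(surviving: list[bytes]) -> tuple[bytes, int]:
    """Star-shaped recovery: the recovery node pulls every surviving block
    and decodes locally. Returns (recovered block, bytes received by the
    recovery node)."""
    received = sum(len(block) for block in surviving)
    return reduce(xor_blocks, surviving), received


def pipeline_recover(surviving: list[bytes]) -> tuple[bytes, int]:
    """Pipelined recovery: surviving nodes form a chain. Each node XORs its
    own block into the partial result it received and forwards it, so the
    recovery node only receives the final partial result."""
    partial = surviving[0]            # first node in the chain starts the partial
    for block in surviving[1:]:       # each later node folds in its own block
        partial = xor_blocks(partial, block)
    return partial, len(partial)      # the last hop delivers one block


# Usage: three data blocks plus one parity block; the second data block is lost.
data = [b"abcd", b"efgh", b"ijkl"]
parity = encode_parity(data)
surviving = [data[0], data[2], parity]

star_block, star_inbound = star_recover(surviving)
pipe_block, pipe_inbound = pipeline_recover(surviving)
```

In this toy setting both strategies move the same number of block-sized transfers overall; what changes is where they land. The star layout concentrates every transfer on the recovery node's inbound link, while the chain puts one block on each hop. Which nodes should be adjacent in the chain depends on the network topology, which is the part the abstract's transmission-length algorithm addresses and which this sketch does not attempt.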
This article is indexed in CNKI and other databases.