基于简单再生码的带宽感知的分布式存储节点修复优化 Bandwidth-Aware Node Repair Optimization for Distributed Storage System Based on Simple Regenerating Code期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

基于简单再生码的带宽感知的分布式存储节点修复优化

引用本文：	丁尚,童鑫,陈艳,叶保留.基于简单再生码的带宽感知的分布式存储节点修复优化[J].软件学报,2017,28(8):1940-1951.

作者姓名：	丁尚童鑫陈艳叶保留

作者单位：	计算机软件新技术国家重点实验室(南京大学), 江苏南京 210023,计算机软件新技术国家重点实验室(南京大学), 江苏南京 210023,上海市电力公司, 上海 200122,计算机软件新技术国家重点实验室(南京大学), 江苏南京 210023

基金项目：	国家自然科学基金(61373014)

摘要：	分布式存储系统为保证可靠性会采用一定存储冗余策略如多副本策略、纠删码策略.纠删码相对于副本具有存储开销小的优点，但节点修复网络开销大.针对修复网络开销优化，业界提出再生码与以简单再生码为代表的局部可修复码，显著降低了修复网络开销.然而，现有基于编码的分布式容错存储方案大都假设节点处于星型逻辑网络结构中，忽略了实际的物理网络拓扑结构和带宽信息.为实现拓扑感知的容错存储优化，相关研究在纠删码和再生码修复过程结合网络链路带宽能力，建立树型修复路径，进一步提高了修复效率.但由于编码和修复过程的差异性，上述工作并不适合于简单再生码修复.针对该问题，本文结合实际物理网络拓扑结构，将链路带宽能力引入到简单再生码的修复过程中，对带宽感知的简单再生码修复优化技术开展研究.论文建立了带宽感知节点修复时延模型，提出了基于最优瓶颈路径和最优修复树的并行修复树构建算法.并通过实验对所提算法性能进行了评估.实验结果表明，与星型修复方式相比，论文所提算法有效地降低了节点修复时延，提高了修复效率.
关键词：	分布式存储简单再生码网络拓扑节点修复
收稿时间：	2016/8/18 0:00:00
修稿时间：	2016/12/1 0:00:00
Bandwidth-Aware Node Repair Optimization for Distributed Storage System Based on Simple Regenerating Code

DING Shang,TONG Xin,CHEN Yan and YE Bao-Liu.Bandwidth-Aware Node Repair Optimization for Distributed Storage System Based on Simple Regenerating Code[J].Journal of Software,2017,28(8):1940-1951.

Authors:	DING Shang TONG Xin CHEN Yan and YE Bao-Liu

Affiliation:	National Key Laboratory for Novel Software Technology(Nanjing University), Nanjing 210023, China,National Key Laboratory for Novel Software Technology(Nanjing University), Nanjing 210023, China,State Grid Shanghai Municipal Electric Power Company, Shanghai 200122, China and National Key Laboratory for Novel Software Technology(Nanjing University), Nanjing 210023, China

Abstract:	In order to ensure the data reliability, fault-tolerant distributed storage systems usually apply some data redundant strategies such as the multi-replicas strategy and the erasure coding strategy. Compared with the multi-replicas strategy, the erasure coding strategy has an advantage of storing less data while costs much more network traffic when repairing a failed node. To reduce the repair network cost, the Regenerating Code and Locally Repairable Codes were proposed, and Simple Regenerating Code is a representative of the Locally Repairable Codes. However, most of the current fault-tolerant distributed storage methods based on coding strategy assume that the storage nodes are located in the star shaped logic network ignoring the physical network topology and link bandwidth capacity. So with applying the physical network topology into the repair process of erasure code and Regenerating Code, some related researches propose methods of building tree shaped repair paths to get further more efficiency for repairing a failed node. But because of the difference in encoding and repairing process, those methods are not fit for the repair of Simple Regenerating Code. To resolve the problem, in this paper we introduce the link bandwidth capacity into the repair process of Simple Regenerating Code based on the physical network topology. We build the bandwidth-aware node repair analysis model, and propose the algorithm to build parallel repair trees based on the optimal bottleneck path and optimal repair tree methods. And we design experiments to evaluate the performance of our algorithm as well. The result show that compared with the method based on star shaped network topology, our algorithm reduces the latency of repair process effectively.

Keywords:	Distributed Storage Simple Regenerating Coding Network Topology Node Repair

	点击此处可从《软件学报》浏览原始摘要信息
	点击此处可从《软件学报》下载全文

设为首页 | 免责声明 | 关于勤云 | 加入收藏