一种混合局部恢复码及Hitchhiker码的存储策略 A Local Reconstruction Code and Hitchhiker Code Mixing Storage Scheme期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

一种混合局部恢复码及Hitchhiker码的存储策略

引用本文：	王梓仲,王海霞,邵艾然,汪东升.一种混合局部恢复码及Hitchhiker码的存储策略[J].计算机学报,2020,43(4):618-630.

作者姓名：	王梓仲王海霞邵艾然汪东升

作者单位：	清华大学计算机科学与技术系北京 100084;清华大学北京信息科学与技术国家研究中心北京 100084;清华大学北京信息科学与技术国家研究中心北京 100084

基金项目：	国家重点研发计划;广东省重点研发计划项目

摘要：	我们正处于一个大数据的时代.如今一个分布式存储系统需要存放PB数量级数据的情况越来越常见.这些系统一般由普通商用组件构成,其出错率相对较高.由此,分布式存储系统需要保证数据的可靠性和可用性.多副本和纠删码是现在最为常用的技术.相比多副本技术,采用纠删码能在同等容错能力下大幅降低存储开销.然而,在进行数据恢复时,使用传统的纠删码(如Reed-Solomon码)会导致系统中产生大量的网络带宽消耗及磁盘读写操作,进而导致退化读延迟过高.注意到在系统中数据的访问频率呈Zipf分布,大多数数据访问只涉及到少量数据,而绝大多数数据的被访频率很低.根据这种数据访问的偏斜性,本文提出如下存储策略以解决采用纠删码的系统退化读延迟过高的问题:对被访频率高的热数据采用低恢复延迟的纠删码(如局部恢复码Local Reconstruction Code,LRC)进行编码,而对被访频率低的冷数据采用保证最小存储开销的纠删码(如Hitchhiker码)进行编码.由于热数据占据了绝大多数的数据访问,因此绝大多数的退化读也将应用在这些热数据上,这样这一策略就能在整个系统的角度获取低恢复开销的优势.同时,冷数据占据了系统绝大多数的数据量,且冷数据由保证最小存储开销的编码进行存储,因此这一策略的存储开销会很低.然而,对于混合存储策略而言,热数据可能会变冷,而冷数据也可能会变热,因此它需要配置一种编码切换过程.一个不恰当的编码切换过程会引起巨大的数据传输量,这是难以让人接受的.为了避免这一缺陷,本文提出了一种LRC和Hitchhiker码之间的高效切换算法.这一算法可以避免上述策略在部署时因冷热数据的转换出现系统瓶颈.在精心选取了两种编码并提出它们之间的高效切换算法后,本文提出的混合存储策略避免了现阶段其余混合存储策略的主要缺点.通过实验验证,此存储策略相较传统的Reed-Solomon码在退化读延迟方面降低了55.8%.在编码切换方面,切换延迟能分别降低为重新编码算法用时的13.4%及33.1%,且当数据从LRC切换为Hitchhiker码时(更为频繁出现的情况)的数据传输量能降至10%.
关键词：	纠删码退化读编码切换容错存储
A Local Reconstruction Code and Hitchhiker Code Mixing Storage Scheme

WANG Zi-Zhong,WANG Hai-Xia,SHAO Ai-Ran,WANG Dong-Sheng.A Local Reconstruction Code and Hitchhiker Code Mixing Storage Scheme[J].Chinese Journal of Computers,2020,43(4):618-630.

Authors:	WANG Zi-Zhong WANG Hai-Xia SHAO Ai-Ran WANG Dong-Sheng

Affiliation:	(Department of Computer Science and Technology,Tsinghua University,Beijing 100084;Beijing National Research Center for Information Science and Technology,Tsinghua University,Beijing 100084)

Abstract:	We are now living in an era of big data.It is more and more common that distributed storage systems store multiple petabytes of data today.These systems are typically built on commodity components,where faults are more likely to occur.It is the responsibility of the distributed storage systems to ensure that data can be stored reliably and durably.Replication and erasure codes are the most common solutions.Under the same level of fault tolerance,using erasure codes can significantly reduce storage cost compared with using replication.However,using traditional erasure codes,like Reed-Solomon codes,increases the consumption of network traffic and disk I/O tremendously when systems recover data,and thus will result in high latency of degraded reads.Noticed that data access frequency is Zipf distribution,most data accesses are applied in a small fraction of data,while most data’s accessing frequencies are very low.According to this data access skew,this paper presents a mixing storage scheme to solve the high degraded read latency problem of erasure-coded storage systems.In this scheme,two erasure codes are used.A code which has the characteristic of low recovery cost,like Local Reconstruction Code,is used to store frequently accessed data(hot data).Another code which guarantees minimum storage cost,like Hitchhiker code,is used to store infrequently accessed data(cold data).Because hot data occupy most data accesses,most degraded reads should be applied on hot data,then this scheme can have an advantage of low recovery cost from the perspective of the entire system.In the meantime,cold data dominate the whole system in data amount’s aspect,then this scheme can have a low storage overhead since cold data are stored by a code that guarantees minimum storage cost.However,there must be some situations in which hot data become cold or cold data become hot.Therefore the mixing storage scheme should be deployed with a code switching procedure.An inappropriate code switching procedure may raise a huge amount of transformation,which is unacceptable.In order to prevent this defect,this paper presents an efficient switching algorithm between Local Reconstruction Code and Hitchhiker code.With this algorithm,the switches between hot data and cold data will not become a bottleneck in the system.While choosing the codes carefully and deploying the efficient code switching algorithm,the mixing storage scheme that this paper presents can be a flawless scheme compared to other existing mixing storage schemes.Through experimental verification,using this storage scheme can reduce 55.8%of degraded read latency compared to traditional Reed-Solomon code.Moreover,when applying this switching algorithm,latency of switching can be reduced to 13.4%and 33.1%respectively.Also,when switching Local Reconstruction Code to Hitchhiker code,which will be the most common situation,the amount of data transferred can be reduced to 10%of the original algorithm by the newly presented code switching algorithm.

Keywords:	erasure code degraded read code switch fault tolerance storage
本文献已被维普万方数据等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏